PREFACE


Digital  Voice  Communication  is  a  pervasive  phenomenon 
in  our  society.  Without  even  noticing  it  we  carry  on  phone 
conversations  over  a  digital  channel.  So,  why  is  digital 
communication so widely used? One reason is that digital
speech signals can be made more immune to noise, and even more
secure, than analog signals. Another reason is the
advanced  development  in  integrated  circuit  technology,  which 
allows  easier  implementation  of  digital  processing 
techniques. Whatever the case, digital communication is an
exciting and, I feel, expanding field, and I am glad that I
could prepare my thesis on this topic.

Linear  Predictive  Coding  is  one  facet  of  the  field  of 
digital  communication.  The  goal  of  the  coding  is  to  reduce 
the  bit  rate  of  the  signal  sent  over  the  communication 
channel.  The  system  I  developed  does  not  reduce  the  bit 
rate  very  much,  but  then  it  is  not  a  true  communication 
system.  It  is  a  computer  simulation  of  such  a  system,  which 
will  give  the  user  an  opportunity  to  explore  some  of  the 
ideas,  methods,  and  problems  of  LPC. 

Since  the  system  developed  here  is  a  tool,  the  most 
important  product  of  the  thesis  may  well  be  the  user's 
guide. It presents the programs and demonstrates how a
student may actually process speech through an LPC system
ABSTRACT

This  report  describes  a  system  which  processes  speech 
using  linear  predictive  methods.  The  system  is  a  software 
simulation  of  an  LPC  analyzer  and  synthesizer.  The  system 
consists  of  two  programs,  one  of  which  processes  the  speech 
to  generate  the  LPC  parameters,  and  another  which  processes 
these  parameters  to  resynthesize  the  speech.  An  important 
aspect  of  the  system  is  that  it  enables  the  user  to  select 
from  various  pitch  and  coefficient  analysis  methods.  It 
also  allows  the  user  to  vary  other  parameters  in  order  to 
simulate  other  changes  in  the  processing  scheme. 

To  test  the  operation  of  the  system,  a  regimen  of 
testing  was  performed  by  varying  the  different  parameters. 
A  separate  program  allows  a  simple  method  for  changing  all 
of  the  parameters  over  which  the  user  has  control.  These 
parameters  are  called  the  decision  variables  and  each  has  an 
allowable  range  of  values.  The  system  operated 
satisfactorily  over  all  values  of  the  decision  variables. 
The flexibility exhibited by the system in this testing
indicates that the system can be a valuable tool for the
study of linear predictive coding of speech in the Signal
Processing Laboratory at the Air Force Institute of
Technology.
CHAPTER I


Introduction 


Background 

Communication,  and  in  particular,  digital  voice 
communication,  is  of  vital  concern  to  the  U.S.  Department  of 
Defense. This concern is founded in the requirement of the
military to maintain command and control over great
distances. This is especially apparent in the need for
aircraft to maintain contact with forces on the ground. A
problem  arises,  though,  when  many  aircraft  need  to  maintain 
contact  with  the  same  command  center.  Since  only  a  finite 
number  of  radio  frequencies  (channels)  are  available,  a 
method  of  maintaining  unambiguous  communication  is  needed. 
One  method  of  resolving  this  is  time  division  multiple 
access, in which each aircraft is allocated a certain amount
of time to access the communication channel. The
communication system is arranged so that each communicator
transmits and receives only during its allocated time slot.
Another method of sharing the channel is frequency division
multiplexing, in which each aircraft is allocated a separate
portion of the radio spectrum available in the communication
channel. In either method, the number of users of each
channel is limited by both the available bandwidth of the
channel and the bandwidths of the users. The bandwidth of
the channel is determined by the nature of the channel: its
geometry and physical realization. The bandwidth of the
users is a function of the bit rate of the message to be
transmitted; as the bit rate of the messages increases, the
bandwidth increases. One way to allow for more users is to
reduce  the  bit  rate  of  each  user.  Linear  Predictive  Coding 
(LPC)  offers  a  means  of  reducing  the  bit  rate  of  each  user 
when  the  message  is  voice  communication. 

The  standard  method  of  digital  voice  communication  is 
pulse code modulation (PCM), in which the analog voice
signal, or waveform, is sampled and quantized. Nyquist's
sampling theorem requires that the sampling rate be at
least twice the highest frequency in the original baseband
signal. High quality speech requires frequency components of
up to 3000 Hz [Ref 12], so after filtering, sampling is
often performed at 8000 Hz. Digitization of the sampled
speech is often performed by quantizing at 12 bits per
sample, a resolution which has proven to enable high quality
speech reproduction. Such a system would require a
transmission rate of 96 kb/s. Various waveform
coders are capable of producing high quality speech, but
only at rates above about 16 kb/s [Ref 1].
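
The arithmetic behind these figures can be checked with a short sketch. It is written in Python for illustration only (the thesis software itself is FORTRAN 5), and the function names are assumptions, not part of the system described here.

```python
def min_sample_rate(highest_freq_hz):
    """Nyquist lower bound on the sampling rate for a band-limited signal."""
    return 2.0 * highest_freq_hz


def pcm_bit_rate(sample_rate_hz, bits_per_sample):
    """Raw PCM transmission rate in bits per second."""
    return sample_rate_hz * bits_per_sample


# 3000 Hz speech needs at least 6000 samples/s; 8000 Hz leaves a margin.
lower_bound = min_sample_rate(3000.0)

# 8000 samples/s at 12 bits per sample gives the 96 kb/s quoted in the text.
rate_bps = pcm_bit_rate(8000, 12)
print(lower_bound, rate_bps / 1000.0)
```

Compared with the 16 kb/s floor quoted for waveform coders, the raw PCM rate is six times higher, which is the gap that parametric coding such as LPC tries to close.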

Linear  Predictive  Coding  (LPC)  is  a  method  of  digital 
speech  processing  which  reduces  the  required  bandwidth  of 
the  signal  by  reducing  the  bit  rate  required  for 
intelligible  communication.  Waveform  coders,  such  as  PCM, 
transmit  the  waveshape  of  the  signal,  whereas  LPC  makes  no 
attempt  to  maintain  the  waveshape  of  the  speech  signal. 


Instead,  parameters  which  describe  the  speech  are  determined 
and  are  transmitted  over  the  channel  to  be  used  to 
reconstruct  the  signal  at  the  receiver.  These  parameters 
may  be  the  prediction  coefficients,  which  determine  the 
digital  filter  used  to  reconstruct  the  speech,  and 
information  about  the  pitch  of  the  speech.  Proper  selection 
of  these  parameters  will  enable  fairly  high  quality  speech 
at greatly reduced bit rates. A common model used to diagram
the production of speech at the receiver is shown in figure
1-1.

The input to the filter is either a quasi-periodic
sequence of impulses spaced at the glottal pitch period, or
a random noise source. When the input to the filter is an
impulse train, the voiced portions of the speech, such as the
vowel sounds, are reproduced. When the input is a noise source,
the unvoiced portions of speech, such as the fricatives
(s, sh, f, th), are reproduced. The gain and coefficients of
the filter are determined by linear predictive analysis.

Figure 1-1 A model for the production of speech.
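
The model of Figure 1-1 can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the thesis code: the names make_excitation and all_pole_filter, the default pitch period of 80 samples, and the Gaussian noise source are all hypothetical, and the filter uses the all-pole difference equation developed in Chapter II.

```python
import random


def make_excitation(n_samples, voiced, pitch_period=80, seed=0):
    """Impulse train for voiced frames, white noise for unvoiced frames."""
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def all_pole_filter(u, a, gain):
    """s(n) = -sum_k a[k-1]*s(n-k) + gain*u(n), starting from a zero state."""
    s = []
    for n, u_n in enumerate(u):
        acc = gain * u_n
        for k in range(1, len(a) + 1):
            if n - k >= 0:
                acc -= a[k - 1] * s[n - k]
        s.append(acc)
    return s


# One voiced frame through a one-pole filter: each impulse decays geometrically.
voiced_frame = all_pole_filter(make_excitation(160, True), [-0.5], 1.0)
```

Switching the `voiced` flag changes only the excitation; the filter, and therefore the spectral envelope, stays the same.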

Statement  of  the  Problem 

The  importance  of  LPC  is  indicated  by  the  existence  of 
an Air Force standard for LPC (LPC-10). As LPC becomes
more prevalent, a need exists for a system available at the
Air Force Institute of Technology which can demonstrate some
of the features and operations of LPC. Most implementations
of LPC are in hardware, with most of the system hidden
within a "black box." These factors led to the need for the
development  of  a  software  model  which  would  offer  a  better 
opportunity  to  examine  the  system. 

A  number  of  algorithms  exist  which  can  be  used  to 
determine the filter coefficients. Among these are the
autocorrelation and covariance methods. Also available
are a number of methods of pitch detection and extraction. A
software  simulation  is  needed  which  would  incorporate  these 
various  algorithms  and  methods  into  a  single,  flexible 
model.  This  model  could  be  used  as  a  learning  device  and  as 
a  tool  for  further  study  of  LPC.  It  would  allow  the  student 
or  researcher  an  easy  means  to  investigate  the  software  of 
the  system  and  vary  the  algorithms  used,  consider  changes  of 
the  algorithm  parameters  and  examine  intermediate  results. 
It  would  also  provide  a  means  of  addressing  some  of  the 
problems  confronting  LPC,  in  particular,  the  noise  problem. 


Scope 


The  system  presented  in  this  thesis  is  designed 
especially  for  the  Signal  Processing  Laboratory  at  AFIT, 
where  it  can  operate  as  a  useful  tool  for  the  study  of  the 
general  LPC  method  of  coding  speech.  It  is  strictly 
software; the code is written in FORTRAN 5, developed on and
accessible from the Data General Eclipse S/250 computer in
the  laboratory.  It  uses  existing  hardware  and  software  for 
the  audio  interface.  It  is  meant  to  be  easily  used,  easily 
understood,  and  user  friendly.  It  should  be  easy  to  update, 
expand,  or  modify.  It  was  designed  to  run  as  close  to  real 
time  as  the  constraints  of  the  laboratory  would  permit. 

Overview  of  the  System 

The  system  is  divided  into  two  main  programs,  an 
analysis  program  and  a  synthesis  program.  This  format  was 
chosen as it best simulates the transmitter and receiver
nature of the LPC speech communication system. The inputs
to the analysis program are digitized speech and the
necessary decision variables (these will be explained
later). The outputs of this program are the LPC parameters.
The  synthesizer  uses  these  parameters  to  reproduce  the 
speech. 

The flexibility of the system is afforded by the
extensive use of subroutines. The system inputs the speech
in time segments, called frames, and operates on these
segments sequentially. This is a recurring process, and
most  of  the  calculations  are  performed  on  each  frame.  The 
subroutine  structure  allows  easy  access  to  the  routines 
which  perform  certain  portions  of  the  calculations,  such  as 
pitch  detection,  coefficient  generation,  and  other 
systematically  used  operations.  For  instance,  the 
subroutines  which  perform  coefficient  generation  are  grouped 
together,  yet  only  one  subroutine  is  used  (although  it  is 
used  on  each  frame)  during  the  execution  of  the  program. 
The  other  subroutines  are  retained  for  the  case  where 
another  method  of  coefficient  generation  is  required.  The 
use  of  subroutines  makes  it  easy  to  expand  the  system  by 
adding  new  routines  which  offer  different  methods  of 
performing  the  required  calculations. 

The  LPC  analyzer  is  the  heart  of  the  system.  Most  of 
the  decision  variables  affect  the  operation  of  the  analyzer, 
because  they  determine  which  subroutines  will  be  used  to 
produce  the  necessary  parameters  for  transmission.  It  reads 
digitized  (PCM)  speech  from  a  contiguous  file  of  integer 
values.  It  scales  the  incident  speech  if  necessary,  places 
it  in  a  floating  point  form,  and  writes  it  to  an  array.  The 
parameters  (pitch  information,  predictor  coefficients,  and 
energy)  are  calculated  for  each  frame  and  then  written  to  a 
sequential  file  which  is  the  input  to  the  synthesis  program. 
This  file  acts  as  a  communication  channel  between 
transmitter  and  receiver  and  is  referred  to  as  the  channel 
file.  Before  any  speech  is  processed,  key  parameters  which 
are needed by the synthesizer are written to the channel
file.  These  parameters  are  needed  by  the  synthesizer  so 
that  it  can  correctly  match  its  decoding  and  synthesis 
scheme  to  a  form  compatible  with  that  of  the  analyzer.  If 
the  forms  do  not  match,  the  LPC  parameters  will  be  read 
incorrectly  and  speech  will  be  impossible  to  reproduce. 

The  synthesizer  reads  the  information  from  the  channel 
file  and  processes  it  to  create  intelligible  speech.  It 
reads  the  pitch  data  and  the  length  of  the  speech  to  be 
processed. It then generates either pseudorandom noise or
a pulse train, which it writes to an array. This array is
the  input  to  the  digital  filter  which  is  described  by  the 
prediction  coefficients  read  from  the  channel.  The  output 
of  this  filter  is  written  to  a  contiguous  file  and  the 
system  then  processes  the  next  block  of  information.  After 
the  entire  channel  file  has  been  read,  the  output  speech  is 
scaled  so  that  it  may  be  listened  to  with  the  use  of  the 
"Audiohist"  or  the  "Audiomod"  program  prepared  by  a  previous 
student  [Ref  3]  and  available  as  a  utility  program  on  the 
system  in  the  Signal  Processing  Laboratory. 
CHAPTER II


Linear  prediction  is  a  method  of  analyzing  a  speech 
waveform  so  that  the  complete  waveform  need  not  be 
transmitted  over  a  communication  channel.  A  linear 
prediction  system  takes  a  digital  speech  signal  and 
processes  it  so  that  only  the  "essence"  of  the  signal 
remains;  no  attempt  is  made  to  maintain  the  waveshape  of  the 
signal.  For  our  purpose,  the  essence  of  the  signal  is  a 
parametric  model  of  the  signal,  where  these  parameters  can 
be  used  to  reconstruct  the  signal.  The  linear  prediction 
system  consists  of  two  major  operations  or  processes.  One 
process  analyzes  the  incoming  speech  and  extracts  the 
relevant  parameters  and  transmits  these  over  a  communication 
channel.  Another  process  at  the  receiving  end  of  the 
channel  transforms  these  parameters  into  speech.  This 
transformation  is  based  on  a  time-varying  digital  filter 
with  predictor  coefficients  which  model  the  vocal  tract  of 
the  speaker  (see  figures  2-1  and  2-2).  Since  the  vocal 
tract  changes  shape  slowly  it  can  be  considered  fixed  over  a 
time  interval  on  the  order  of  10  ms,  and  the  digital  filter 
can  characterize  the  vocal  tract  over  this  short  interval 
[Ref 12]. The input to this filter is the assumed
excitation of the actual human vocal tract: glottal pulses
occurring every pitch period, or random noise.

Figure 2-1 Cross section of the vocal tract showing the major
anatomical structures involved in speech production.

The question addressed by linear prediction is: how do we
find these predictor coefficients?

Speech,  of  course,  is  an  analog  process,  so  it  must  be 
digitized  before  it  can  be  processed  by  linear  prediction 
methods. This is usually accomplished by pulse code
modulation (PCM), in which the analog signal is quantized in
time (sampled) with a sampling frequency of $f_s$, and
quantized in amplitude. For the analog waveform, s(t), the
sampled waveform can be expressed as

$$ s(nT) = s(t)\big|_{t=nT} \tag{2-1} $$

where T is the time between samples ($T = 1/f_s$). Since $f_s$, and
therefore T, remain constant (in our case $f_s$ = 8000 Hz), we
can write s(nT) as s(n) with no loss of generality.

If  we  assume  that  the  signal,  s(n),  is  the  output  of  a 
system  (our  assumption  above  concerning  speech  at  the 
receiver  being  the  output  of  a  filter  allows  this)  with 
input  u(n),  then  that  signal  can  be  expressed  as  a  linear 
function  of  the  past  outputs  and  the  present  and  past 
inputs.  That  is,  the  output  can  be  predicted  by  a  linear 
combination  of  inputs  and  outputs.  Hence  the  description  of 
this  scheme  as  linear  prediction.  This  relation  is  written 
as 


$$ s(n) = -\sum_{k=1}^{p} a_k\, s(n-k) + G \sum_{m=0}^{q} b_m\, u(n-m), \qquad b_0 = 1 \tag{2-2} $$

where $a_k$, $1 \le k \le p$, $b_m$, $1 \le m \le q$, and G are the
parameters of the system. The goal of the linear predictor
is to determine the values of these parameters.

Entering the frequency domain, we can take the
z-transform of both sides of (2-2) to obtain the transfer
function, H(z), of the digital system. The transfer
function is the ratio of the output to the input and can be
expressed as

$$ H(z) = \frac{S(z)}{U(z)} = G\,\frac{1 + \sum_{m=1}^{q} b_m z^{-m}}{1 + \sum_{k=1}^{p} a_k z^{-k}} \tag{2-3} $$


where  S(z)  is  the  z-transform  of  s(n)  and  U(z)  is  the 
z-transform  of  u(n). 

This equation describes a pole-zero model of the
system. Variations on this model are the all-zero model,
where $a_k = 0$, $1 \le k \le p$, and the all-pole model,
where $b_k = 0$, $1 \le k \le q$. Historically, the all-pole
method of analysis has been by far the most widely used
method of linear prediction [Ref 7]. For the all-pole model the
equations which must be solved form a linear set, whereas
even the simplest pole-zero model gives a set of non-linear
equations [Ref 10:472]. Since the all-pole model will
greatly simplify the calculation of the coefficients, we
will only concern ourselves with this model. Therefore, the
transfer function of interest is

$$ H(z) = \frac{G}{A(z)} = \frac{G}{1 + \sum_{k=1}^{p} a_k z^{-k}} \tag{2-4} $$

A(z) will be referred to as the inverse filter, and the
coefficients $a_k$, $1 \le k \le p$, will be referred to as the
predictor coefficients.

By  taking  an  inverse  z-transform,  we  return  to  the  time 
domain  and  get  the  relation 


$$ s(n) = -\sum_{k=1}^{p} a_k\, s(n-k) + G\, u(n) \tag{2-5} $$

From this equation it is evident that the output
sequence, s(n), can be generated with only one input, u(n),
and p previous outputs.

If we assume that the input u(n) is unknown [Ref 7], we
can calculate a prediction of s(n), $\hat{s}(n)$, which is based
strictly on the past outputs. This assumption gives us a
result which is independent of the input and can be written
as

$$ \hat{s}(n) = -\sum_{k=1}^{p} a_k\, s(n-k) \tag{2-6} $$

Now we will define the predictor error, e(n), as the
difference between the original signal, s(n), and the
predicted signal, $\hat{s}(n)$. That is

$$ e(n) = s(n) - \hat{s}(n) = s(n) + \sum_{k=1}^{p} a_k\, s(n-k) \tag{2-7} $$



Given  s(n),  we  can  define  the  total  squared  error,  E,  as 


$$ E = \sum_{n} e^2(n) = \sum_{n} \Big[ s(n) + \sum_{k=1}^{p} a_k\, s(n-k) \Big]^2 \tag{2-8} $$
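
As a concrete check of the error definitions, the following Python sketch evaluates the predictor error and the total squared error for a given set of coefficients. It is illustrative only: the function names are assumptions, samples before the start of the list are taken as zero, and the sign convention is the one used in this chapter (the error is the signal plus the weighted past samples).

```python
def prediction_error(s, a):
    """e(n) = s(n) + sum_k a[k-1]*s(n-k), with s(n) taken as 0 for n < 0."""
    p = len(a)
    e = []
    for n in range(len(s)):
        past = sum(a[k - 1] * s[n - k] for k in range(1, p + 1) if n - k >= 0)
        e.append(s[n] + past)
    return e


def total_squared_error(s, a):
    """E = sum over n of e(n)**2 for the samples supplied."""
    return sum(x * x for x in prediction_error(s, a))


# A geometrically decaying signal is predicted exactly by a_1 = -0.5,
# so only the first sample (which has no past) contributes to E.
signal = [1.0, 0.5, 0.25, 0.125]
print(prediction_error(signal, [-0.5]), total_squared_error(signal, [-0.5]))
```

Any other choice of the single coefficient gives a larger total squared error for this signal, which is the property the minimization below exploits.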


By definition, the most accurate predictor coefficients
result in the least error. To find the minimum squared
error, we set the derivative of E with respect to each
coefficient to zero. That is

$$ \frac{\partial E}{\partial a_i} = 0, \qquad 1 \le i \le p \tag{2-9} $$

Equations (2-8) and (2-9) reduce to the set of
equations

$$ \sum_{k=1}^{p} a_k \sum_{n} s(n-k)\, s(n-i) = -\sum_{n} s(n)\, s(n-i), \qquad 1 \le i \le p \tag{2-10} $$


These  equations  are  called  the  normal  equations  [Ref  7]. 
Given  any  signal,  s(n),  (2-10)  forms  a  set  of  p  equations  in 
p  unknowns  which  can  be  solved  to  give  the  predictor 
coefficients  which  minimize  E.  The  parameter,  p,  is  the 
number  of  poles,  and  consequently  the  number  of  predictor 
coefficients  in  the  transfer  function.  Note  that  the  range 
of  summation  over  n  is  the  range  of  the  signal  for  which  the 
error  will  be  minimized  and  remains  unspecified. 
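
The step from the squared error to the normal equations is a single application of the chain rule. Written out as a sketch in the notation of this chapter:

```latex
\begin{align*}
\frac{\partial E}{\partial a_i}
  &= 2 \sum_{n} \Big[ s(n) + \sum_{k=1}^{p} a_k\, s(n-k) \Big]\, s(n-i) = 0 \\
\Longrightarrow \quad
\sum_{k=1}^{p} a_k \sum_{n} s(n-k)\, s(n-i)
  &= -\sum_{n} s(n)\, s(n-i), \qquad 1 \le i \le p
\end{align*}
```

Dividing by two and moving the term that does not involve the coefficients to the right-hand side gives one equation for each value of i.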


Autocorrelation  Method 

If we let the range of summation over n be from $-\infty$ to
$+\infty$, we will get a global minimization of the error and
equation (2-10) reduces to

$$ \sum_{k=1}^{p} a_k\, R(i-k) = -R(i), \qquad 1 \le i \le p \tag{2-11} $$

where R(i) is the autocorrelation, and is defined as

$$ R(i) = \sum_{n=-\infty}^{\infty} s(n)\, s(n-i) \tag{2-12} $$

Since the signal is known over only a finite duration, we
divide the signal into frames and assume that the signal
s(n) is identically zero outside of the interval $0 \le n \le N-1$.
A suitable way to express this is as

$$ \hat{s}(n) = s(n+N)\, w(n) \tag{2-13} $$

where w(n) is a window function which is identically zero
outside of the interval $0 \le n \le N-1$. This windowing process
produces frames which are N samples wide. Since excessive
errors will be encountered at the frame boundaries because
of our drastic assumption of zero outside of the frame, we
need to taper the edges of the window toward zero. To
accomplish this we use a Hamming window. For simplicity of
notation we will drop the index N and the caret and speak
only of the signal s(n), which is properly only a portion
(one frame) of the complete signal.
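
The windowing and autocorrelation steps can be sketched as follows. This Python fragment is an illustration, not the thesis routine: the function names and the `start` argument (the index of the first sample of the frame) are assumptions, and the window uses the usual Hamming coefficients 0.54 and 0.46.

```python
import math


def hamming(N):
    """w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1)) for 0 <= n <= N-1."""
    if N == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (N - 1)) for n in range(N)]


def autocorrelation(frame, p):
    """R(i) = sum_n frame[n]*frame[n-i], i = 0..p, frame zero outside 0..N-1."""
    N = len(frame)
    return [sum(frame[n] * frame[n - i] for n in range(i, N)) for i in range(p + 1)]


def windowed_autocorrelation(signal, start, N, p):
    """Window one N-sample frame beginning at `start`, then compute R(0)..R(p)."""
    w = hamming(N)
    frame = [signal[start + n] * w[n] for n in range(N)]
    return autocorrelation(frame, p)


print(autocorrelation([1.0, 2.0, 3.0], 2))  # [14.0, 8.0, 3.0]
```

Because the frame is zero outside its N samples, the infinite sum in the definition collapses to the finite range used in `autocorrelation`.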
Note that R(i) is an even function, that is

$$ R(i) = R(-i) \tag{2-14} $$

The coefficients R(i-k) form an autocorrelation matrix.
Because R(i) is even and is a function of only the
difference of the indices, the resultant autocorrelation
matrix is symmetric and all of the elements along a diagonal
are equal. Such a matrix is called symmetric Toeplitz.
This fact makes the linear system of equations easy to solve
by recursive methods.


Covariance  Method 


Another technique for producing the predictor coefficients
is called the covariance method. If we minimize the squared
error over the finite interval $0 \le n \le N-1$, we get the
set of equations

$$ \sum_{k=1}^{p} a_k\, c(k,i) = -c(0,i), \qquad 1 \le i \le p \tag{2-15} $$

where c(i,k) is called the covariance, and is defined as

$$ c(i,k) = \sum_{n=0}^{N-1} s(n-i)\, s(n-k) \tag{2-16} $$

Because we define our range of summation over a finite
interval, we need not window the signal, but as we will see
in Chapter 4, windowing vastly improves the results of the
inversion process needed to calculate the predictor
coefficients.

The covariance terms are symmetric, that is

$$ c(i,k) = c(k,i) \tag{2-17} $$

and make up a symmetric covariance matrix. However, the
terms of this matrix, unlike the terms of the autocorrelation
matrix, are not equal along the diagonals.
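
A small numerical sketch makes the difference from the Toeplitz case visible. The Python below is illustrative only: the function names and the `start` offset are assumptions, and because the covariance sum reaches back p samples before the frame, the sketch requires that those samples exist.

```python
def covariance(signal, start, N, i, k):
    """c(i,k) = sum_{n=0}^{N-1} s(n-i)*s(n-k), where s(n) = signal[start + n]."""
    return sum(signal[start + n - i] * signal[start + n - k] for n in range(N))


def covariance_matrix(signal, start, N, p):
    """p-by-p matrix of c(i,k) for 1 <= i, k <= p."""
    if start < p:
        raise ValueError("need p samples before the start of the frame")
    return [[covariance(signal, start, N, i, k) for k in range(1, p + 1)]
            for i in range(1, p + 1)]


C = covariance_matrix([1.0, 2.0, 3.0, 4.0, 5.0], start=2, N=3, p=2)
print(C)  # [[29.0, 20.0], [20.0, 14.0]]
```

The off-diagonal terms agree, so the matrix is symmetric, but the two entries on the main diagonal differ, so it is not Toeplitz and the Levinson-type shortcut for the autocorrelation method does not apply directly.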


Solution  Algorithms 

The  solution  involves  inverting  the  matrix  which 
describes  the  set  of  p  equations  to  be  solved.  A  number  of 
algorithms  exist  for  inverting  matrices  with  a  computer.  Of 
main  concern  in  developing  the  solution  algorithms  is  a  need 
for  simplicity,  ease  of  implementation  in  the  software,  and 
reduction of the number of calculations. The Toeplitz
nature of the autocorrelation matrix makes the system of p
linear equations,

$$ \sum_{k=1}^{p} a_k\, R(i-k) = -R(i), \qquad 1 \le i \le p \tag{2-11} $$

easy to solve and reduces the number of computations
required. Levinson developed an elegant recursive method
for solving such equations. Durbin further expanded on the
recursion by exploiting the fact that when the equation is
expanded into matrix form, the right side of the equation is
contained on the left side [Ref 7].

The  recursive  solution  attributed  to  Durbin  is  usually 
presented  as  [Ref  11] 
1. $$ E(0) = R(0) \tag{2-18a} $$

2. $$ k_i = -\frac{R(i) + \sum_{j=1}^{i-1} a_j^{(i-1)} R(i-j)}{E(i-1)}, \qquad 1 \le i \le p \tag{2-18b} $$

3. $$ a_i^{(i)} = k_i \tag{2-18c} $$

4. $$ a_j^{(i)} = a_j^{(i-1)} + k_i\, a_{i-j}^{(i-1)}, \qquad 1 \le j \le i-1 \tag{2-18d} $$

5. $$ E(i) = (1 - k_i^2)\, E(i-1) \tag{2-18e} $$

After the values are solved recursively for i = 1, 2, ..., p the
final solution, giving the p predictor coefficients, is

$$ a_j = a_j^{(p)}, \qquad 1 \le j \le p \tag{2-19} $$

The  solution  is  unaffected  if  the  autocorrelation  values  are 
scaled  by  a  constant.  Usually,  the  autocorrelation  values 
are  normalized  by  dividing  them  by  R(0),  giving  normalized 
autocorrelations,  which,  except  for  R(0),  are  all  less  than 
one. 

A by-product of the recursive method is E(i), which is
the predictor error for a predictor of order i. If the
autocorrelation function is normalized, this error value
will also be normalized. This parameter is important
because, given the output filter described by the p predictor
coefficients, the value E(p) is proportional to the gain
required to reproduce the speech signal. In the prediction
model, this parameter will represent the gain of the output
filter.
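
The recursion attributed to Durbin can be sketched compactly. The Python below is an illustration rather than the thesis subroutine: the function name and list layout are assumptions, and it uses the sign convention of this chapter, in which the weighted sum of autocorrelation values equals the negative of R(i).

```python
def levinson_durbin(R, p):
    """Solve sum_k a_k R(i-k) = -R(i), 1 <= i <= p.

    R holds R(0)..R(p). Returns (a, E) with a = [a_1, ..., a_p] and
    E the prediction error of the order-p predictor.
    """
    if R[0] <= 0.0:
        raise ValueError("R(0) must be positive")
    a = []        # a[j-1] holds a_j of the current order
    E = R[0]      # E(0) = R(0)
    for i in range(1, p + 1):
        acc = R[i] + sum(a[j - 1] * R[i - j] for j in range(1, i))
        k = -acc / E                                   # reflection coefficient k_i
        updated = [a[j - 1] + k * a[i - j - 1] for j in range(1, i)]
        updated.append(k)                              # a_i of order i equals k_i
        a = updated
        E = (1.0 - k * k) * E
    return a, E


# Autocorrelation of a first-order process: the second coefficient vanishes.
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], 2)
print(coeffs, err)
```

Scaling every R(i) by the same constant leaves the coefficients unchanged and scales E by that constant, which matches the remark above about normalizing by R(0).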

The  covariance  method  also  uses  a  recursive  method  to 
determine the coefficients [Ref 10]. The system must first
be  initialized  by  setting  the  following: 


E  (0 )  =  c  (0 ) 

(2-20a) 

B (0 )  =  c(l) 

(2-20b) 

kx  =  -c(0)/c(l) 

(2-20c) 

a0(1)  =  1  ,  a1(1)  =  k. 

(2-20d) 

E(l)  =  [e  (0 )  -  k|B(l) 

(2-20e) 

The  recursive  equations  can  be  written  as: 


1.  b^"1*  =  1. 


(2-21a) 


n+1 

2.  G  *  1/B  ( n)  ^  R(i)  b 

j=l 


(n) 


(2-21b) 


■z 


3.  B(i-l)  =  )  R  ( i)  bj 

j=l 


( i-1) 


(2-21c) 


i-1 


4. 

5. 

6. 

7. 


=  1/B ( i-1)  ^  R(i)  3^ 


I 

j=0 

a  ^  ^  x  a  ^  ^  +  k  b 

aj  *  aj  +  KiDj 

(i) 


(i-1) 


(i-1) 


ai'"  =  ki 


E(i)  *  (1-k  ^  )E( i-1) 


(2-21d) 

(2-21e) 
(2-21 f) 
(2-21g) 


At this point step m is complete. After p recursions, the
final solution is

$$ a_i = a_i^{(p)}, \qquad 1 \le i \le p \tag{2-22} $$

which gives the predictor coefficients for the output filter
of order p. The parameter E(p) is proportional to the gain
of the system.

Summary 

Linear Prediction is a method of parameterizing a
signal. Using minimum mean square error techniques, the
procedure generates the filter coefficients which describe
the system producing the signal. For speech, this system is
the human vocal tract. With these coefficients and a valid
excitation to the output filter, the speech can be
reproduced to yield intelligible results.


CHAPTER  III 


Development  of  the  LPC  System 


To  attain  the  flexibility  required  in  this  linear 
predictive  coding  model,  the  system  is  divided  into  two 
separate programs. These programs are the LPC analyzer and
the LPC speech synthesizer. Flow diagrams of these programs
are presented in figures 3-1 and 3-2. The two programs are
coupled by a file in the computer, which is called the
"channel file", and which can be considered as a
communication channel. The analyzer writes speech
information to the channel; the vocoder reads this
information and from it reproduces a synthesized version of
the original speech. Both of these driving programs are
composed of a number of subroutines which perform most of
the calculations. This extensive use of subroutines is
intended to make the system more flexible as well as easier
to understand. For the most part, bookkeeping is performed
by the main programs, whereas most of the LPC calculations
are performed by the subroutines. Intermediate results and
parameters can be examined by looking at the relevant
subroutines. The method of calculation can also be
examined. This can be done by looking at the code or by
placing "type" or "write" statements into the subroutines.
Adding new methods simply requires the addition of the
proper subroutines and their corresponding calling
statements in the main program.

Figure 3-1 Block diagram of the Analysis program (PREDICT).

Figure 3-2 Block diagram of the Synthesis program (VOCODE).


Further flexibility is attained by the use of
"decision variables." These decision variables are preset by
the  user  and  control  the  operation  of  the  main  programs. 
These  variables  select  different  prediction  methods,  vary 
the  number  of  poles  used  in  the  analysis,  and  affect  other 
scaling  or  test  parameters.  Two  methods  of  setting  the 
decision  variables  are  available  to  the  user.  The  easiest 
is  to  simply  run  the  program  and  wait  for  the  prompts.  The 
other  method  is  to  write  the  decision  variables  to  a  file, 
with  the  aid  of  a  program  (SETUP)  which  is  designed  strictly 
for  this  purpose.  This  method  is  preferred  and  is  useful  if 
the  user  wants  to  hear  different  segments  of  speech  without 
having  to  worry  about  the  decision  variables  selected. 
Because  the  file  is  self-contained,  the  user  need  not  be 
interrupted by any prompts. The decision variables will be
identified and named as they are encountered in the
description of the system.

The  modularity  of  the  system  helped  considerably  in  the 
construction  and  testing  of  the  programs  and  subroutines. 
Each  part  of  the  program  could  be  tested  independently 
before  it  was  consolidated  into  the  complete  program. 
Modularity also allowed concurrent development of the
analyzer and the vocoder because results from the analyzer
could be tested by direct application of the vocoder. This
also allowed various values of decision variables to be
tested at each stage of the development.
The input to the LPC analyzer is digitized speech. The
analyzer processes this signal and generates the parameters
which will be written to a channel file. The first stage of
the analyzer creates or opens the files from which information
will be read or to which it will be written. Two files are
opened, one which contains the digitized speech to be coded,
and another which contains the decision variables. The file
which contains the incoming digitized speech must be
contiguous, with each block containing 256 integer valued
samples. One file, the channel file, into which the LPC speech
information will be written, is created. Next, the decision
variables are determined, either by prompts from the terminal,
or from the file containing these variables. The variables
pertaining to the operation of the vocoder [number of poles in
the synthesis filter (POLES), pre/de-emphasis (NEMP), unvoiced
gain factor (UNGA), and the shape of the glottal pulse
(NGLT)] are then written to the channel file. This completes
the  initialization  of  the  system.  The  rest  of  the  system  is 
a  large  loop  which  is  repeated  until  the  input  speech  data 
is  exhausted. 

The  first  order  of  business  within  the  loop  is  to  load 
a  large  array  with  five  blocks  of  speech  (each  block 
contains  256  samples)  from  the  contiguous  input  file.  This 
large array is needed because this data is written to two
small arrays, one used for pitch detection, and one used for
predictor coefficient generation. Each array contains one
frame of speech. The length of each frame is set by a
decision variable (MAXFR). The large array is sized so that a
wide range of frame sizes can be used by the analyzer.

Using counters to keep track of where the process is in
terms of blocks and array members, a portion of the large
array is written to a smaller array. This smaller array
contains one frame of speech to be processed for energy and
pitch.  This  frame  is  first  processed  by  the  energy 
subroutine.  This  subroutine  finds  the  energy  (sum  of  the 
squares)  in  the  frame  to  determine  if  the  data  can  be 
considered silence or speech. The test threshold is a
decision variable (THRESH) which is set by the user. The
energy of the frame is a function of its length; therefore,
if the length of the frame is changed, the decision variable
THRESH should be changed by a proportional amount to
maintain consistent results. If the threshold is not
exceeded,  the  signal  is  considered  silence,  and  no  more 
calculations  need  be  made  on  this  frame.  The  subroutine  has 
a  memory  of  three  previous  energy  calculations.  The  need  for 
the  memory  will  become  apparent  when  the  nature  of  the  pitch 
detection  and  the  synchronous  nature  of  the  analysis  is 
discussed. 
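
The energy test described above can be sketched in a few lines. This is only an illustrative sketch, not the thesis subroutine: the function names are hypothetical, and the proportional rescaling of THRESH simply follows the remark that frame energy grows with frame length.

```python
def frame_energy(frame):
    """Energy of one frame: the sum of the squared samples."""
    return sum(s * s for s in frame)


def is_silence(frame, thresh):
    """A frame whose energy does not exceed THRESH is treated as silence."""
    return frame_energy(frame) <= thresh


def scale_threshold(thresh, old_len, new_len):
    """Rescale THRESH in proportion to a change in frame length (assumed rule)."""
    return thresh * new_len / old_len


if __name__ == "__main__":
    quiet = [1, -1] * 100      # 200 samples, energy 200
    loud = [10, -10] * 100     # 200 samples, energy 20000
    print(is_silence(quiet, 250.0), is_silence(loud, 250.0))
    print(scale_threshold(250.0, 200, 400))
```

With the recommended THRESH of 250.0 for a 200-sample frame, doubling the frame to 400 samples would suggest a threshold of about 500.0 under this assumed rule.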

If  the  frame  has  sufficient  energy  to  be  considered 
speech  and  not  background  noise,  the  pitch  is  then 
calculated. The pitch detection routines perform two tasks:
determining the voiced quality of the speech (whether the
frame is voiced or not), and then the pitch of the speech if
the frame is voiced. The frame for pitch is moved at an
interval (MAXPT) which is set by the user. Pitch is
therefore updated every MAXPT samples and is assumed
constant over the frame.

One method of pitch detection is essentially a
correlation process. The frame array is correlated against
itself, and the peaks which fall within an allowable range
of times are tested to find a maximum. The maximum peak is
then tested against a threshold which is set by the user.
This threshold is the decision variable STHR. If the
magnitude of the peak falls below the threshold, the speech
is declared unvoiced; otherwise it is declared voiced. The
pitch is a function of the location of the peak. Since the
fundamental pitch of most human speakers falls within a
fairly narrow range (70-350 Hz) [Ref 9], only a narrow range
of peak positions needs to be considered. Due to the
harmonics and the formant structure of speech, the pitch
period is a difficult calculation and is quite prone to
error. Therefore interpolation is employed to smooth the
curve described by the pitch values.
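
A minimal sketch of the correlation approach follows. It is not the thesis routine: the sampling rate of 8000 Hz, the normalization by the zero-lag value, and the function names are all assumptions made for illustration. Only the 70-350 Hz search range and the role of STHR come from the text.

```python
def autocorr(frame, lag):
    """Autocorrelation of the frame at a single lag."""
    return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))


def detect_pitch(frame, sthr, fs=8000, f_lo=70.0, f_hi=350.0):
    """Return (voiced, period_in_samples) for one frame.

    fs is an assumed sampling rate; the lag search is limited to the
    periods that correspond to pitches between f_lo and f_hi.
    """
    r0 = autocorr(frame, 0)
    if r0 == 0:
        return False, 0                      # all-zero frame: nothing to detect
    lag_min = int(fs / f_hi)                 # shortest period considered
    lag_max = min(int(fs / f_lo), len(frame) - 1)
    best_lag, best_val = 0, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = autocorr(frame, lag) / r0      # normalized so that STHR lies in 0..1
        if val > best_val:
            best_lag, best_val = lag, val
    if best_val < sthr:
        return False, 0                      # peak too weak: declared unvoiced
    return True, best_lag


if __name__ == "__main__":
    # Impulse train with a period of 80 samples (100 Hz at the assumed 8000 Hz).
    pulses = [1.0 if n % 80 == 0 else 0.0 for n in range(400)]
    print(detect_pitch(pulses, sthr=0.35))
```

The pitch in Hz would then be fs divided by the returned period.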

This interpolation delays the final value of the pitch
by three frames (see figure 3-3). An estimate of the pitch
in the first pitch analysis frame is computed. The frame
start is shifted by the pitch shift interval (MAXPT), and
the pitch of the second frame is estimated. Similarly, the
pitch of the third frame is estimated. These three values
are used to interpolate a better estimate of the pitch in
the first frame. The pitch in the following frames is
determined in a similar manner. For instance, the estimate
of the fourth frame is used along with the estimates of the
second and third frames to interpolate a better estimate of
the pitch in the second frame. This delay requires that the
calculation of the predictor coefficients be equivalently
delayed. The pitch detectors must therefore have a memory
of three to perform the interpolation and to accommodate the
delay of the coefficient analysis.
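
The bookkeeping implied by this three-frame memory can be sketched as follows. The text does not give the interpolation formula, so a median of the three raw estimates is used here purely as a stand-in; the class name and interface are hypothetical.

```python
from collections import deque


class PitchSmoother:
    """Keeps the last three raw pitch estimates.

    Once three estimates are held, push() returns a smoothed value for
    the OLDEST frame in the window, which is why the final pitch lags
    the raw estimates. The median rule is an assumption; the thesis
    only states that three values are interpolated.
    """

    def __init__(self):
        self.history = deque(maxlen=3)

    def push(self, raw_estimate):
        self.history.append(raw_estimate)
        if len(self.history) < 3:
            return None                      # still filling the memory
        return sorted(self.history)[1]       # median of the three estimates


if __name__ == "__main__":
    smoother = PitchSmoother()
    raw = [80, 160, 82, 81, 79]              # 160 is a pitch-doubling error
    print([smoother.push(p) for p in raw])
```

In this sketch the outlier of 160 never reaches the output, and the first smoothed value appears only after the third raw estimate has been pushed.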

After  the  pitch  is  determined  by  interpolation,  the  LPC 
coefficients  and  the  gain  must  be  calculated.  The 
coefficient  analysis  frame  can  be  pre-processed  in  a  number 
of  different  ways.  If  desired,  the  speech  can  be 
pre-emphasized. The pre-emphasis is accomplished with the
operation

y(n) = x(n) - 0.9x(n-1)    (3-1)

The decision variable which controls this is NEMP. Also
available to the user is the option to window the frame. A
decision variable, HI, controls whether or not a Hamming
window is used on the speech. After pre-processing, the LPC
calculations are straightforward, and are performed by one
of the algorithms presented in Chapter II. A decision
variable (MP) controls which algorithm to use, and therefore
which subroutine to perform. A decision variable (POLES)
also controls the number of poles used in the analysis. The
system is unable to operate with more than 20 poles, because
of constraints on the size of some of the arrays. The
predictor coefficients are computed synchronously, that is,
the start of the frame used by the coefficient generating
routines is set by the shift introduced by the pitch period.
This avoids the problem of having analysis extend over two
adjacent frames.
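
The pre-processing and coefficient steps can be illustrated with a short sketch. It assumes the autocorrelation method solved by the Levinson-Durbin recursion, which is a standard formulation and not necessarily the subroutine used here; the function names and the convention that the predictor is x_hat(n) = sum of a[k-1]*x(n-k) are assumptions.

```python
import math


def pre_emphasize(x, mu=0.9):
    """Equation (3-1): y(n) = x(n) - mu*x(n-1), taking x(-1) as zero."""
    return [x[n] - (mu * x[n - 1] if n > 0 else 0.0) for n in range(len(x))]


def hamming(n_points):
    """Hamming window weights 0.54 - 0.46*cos(2*pi*n/(N-1))."""
    if n_points == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def autocorrelation(x, max_lag):
    """Autocorrelation values r[0..max_lag] of the (windowed) frame."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    Returns (a, err): predictor coefficients a[0..order-1] and the
    residual prediction error energy.
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0.0:
            break                                    # degenerate frame
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err                                # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a = new_a
        err *= (1.0 - k * k)
    return a, err


if __name__ == "__main__":
    frame = [math.sin(0.3 * n) for n in range(200)]
    emphasized = pre_emphasize(frame)
    windowed = [s * w for s, w in zip(emphasized, hamming(len(emphasized)))]
    coeffs, err = levinson_durbin(autocorrelation(windowed, 10), 10)
    print(len(coeffs), err >= 0.0)
```

In many LPC formulations the gain is derived from the residual error energy returned here, although the text does not state how the gain is computed in this system.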

The  last  task  during  processing  of  each  frame  is  to 
write  the  relevant  parameters  to  the  channel  file.  These 
parameters  are  the  voiced  quality  of  the  speech,  the  pitch, 
the  output  analysis  frame  size,  the  predictor  coefficients, 
and  the  gain  of  the  system.  The  output  analysis  frame  size 
is  the  shift  interval  for  the  start  of  the  next  coefficient 
analysis  frame. 

Description  of  the  Synthesizer 

The  LPC  synthesizer  produces  speech  from  the  parameters 
read  from  a  channel  file.  The  first  stage  of  the 
synthesizer  program  opens  the  channel  file  and  creates  the 
file  to  which  synthesized  speech  will  be  written.  The 
program  is  initialized  by  reading  the  first  four  values  from 
the  channel  file.  These  values  are  the  number  of  poles  used 
in the analysis and the synthesis (POLES), the glottal pulse
shape (NGLT), the unvoiced gain factor (UNGA), and the value
of the flag indicating whether pre-emphasis was employed at
the input to the analyzer (NEMP). The unvoiced gain factor
is used to normalize the unvoiced excitation to the output
filter. The vocoder then synthesizes one variable length
frame of speech at a time. The frames have varying length
because  of  the  synchronous  method  of  coding.  To  synthesize 
each  frame,  the  vocoder  first  reads  the  pitch  information, 
the  length  of  the  frame  to  be  produced,  and  the  gain.  It 
then  reads  the  predictor  coefficients.  The  pitch 
information  drives  the  synthesis  process. 

If  the  frame  is  voiced,  an  array  simulating  the  glottal 
pulse  drives  the  digital  output  filter.  Two  pulses  are 
generated  at  intervals  separated  by  the  pitch  period,  and 
are  written  to  an  array  of  length  twice  the  period.  This 
array,  the  predictor  coefficients,  and  the  gain  then  drive 
the  subroutine  "THROAT"  which  is  the  digital  output  filter. 
The  output  of  this  filter  is  the  synthesized  speech  and  is 
written  to  the  output  file. 

If  the  frame  is  unvoiced,  a  noise  generation  routine  is 
called.  This  double-precision  routine  uses  a  uniform  random 
number  generator  to  produce  a  normal  random  number  sequence. 
This  routine  writes  the  sequence  to  an  array.  The  gain  from 
the  input  file  is  scaled  by  the  unvoiced  gain  factor  to  give 
a  value  for  the  gain  to  drive  the  digital  output  filter. 
This  scaling  is  required  because  the  excitation  of  the 
filter  (the  noise  array)  is  not  normalized  in  energy  with 
the  excitation  of  the  filter  when  the  speech  is  voiced. 
This  array,  the  predictor  coefficients,  and  the  gain  then 
drive the digital output filter. The output of this filter
is the synthesized speech, which is written to the output
file.


If  the  frame  is  silence,  the  output  filter  is  by-passed 
and  zeros  are  written  directly  to  the  output  file. 
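
The three synthesis cases can be sketched as below. This is an illustration under stated assumptions rather than a copy of "THROAT": a unit impulse stands in for the glottal pulse shape (NGLT is not modeled), the normal sequence is approximated by summing twelve uniform numbers, the unvoiced gain is taken to be the product of the gain and UNGA, and all function names are hypothetical.

```python
import random


def all_pole_filter(excitation, coeffs, gain):
    """s(n) = gain*e(n) + sum over k of coeffs[k-1]*s(n-k)."""
    p = len(coeffs)
    hist = [0.0] * p                       # hist[0] holds s(n-1)
    out = []
    for e in excitation:
        s = gain * e + sum(c * h for c, h in zip(coeffs, hist))
        out.append(s)
        hist = ([s] + hist)[:p]
    return out


def synthesize_voiced(coeffs, gain, period):
    """Two pulses one pitch period apart in an array of twice the period."""
    exc = [0.0] * (2 * period)
    exc[0] = 1.0                           # unit impulse as a stand-in pulse shape
    exc[period] = 1.0
    return all_pole_filter(exc, coeffs, gain)


def synthesize_unvoiced(coeffs, gain, unga, length, rng):
    """Noise excitation; the gain is scaled by the unvoiced gain factor."""
    # Sum of 12 uniform numbers minus 6 is approximately normal (assumed method).
    exc = [sum(rng.random() for _ in range(12)) - 6.0 for _ in range(length)]
    return all_pole_filter(exc, coeffs, gain * unga)


def synthesize_silence(length):
    """Silence bypasses the filter: zeros are written directly."""
    return [0.0] * length


if __name__ == "__main__":
    print(synthesize_voiced([0.5], 2.0, 3))
    print(len(synthesize_unvoiced([0.5], 1.0, 0.1, 50, random.Random(0))))
```

Driving a voiced frame with the noise excitation instead of the pulses gives the whisper-like output that a pitch-free synthesis would produce.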

If  pre-emphasis  was  used  by  the  analyzer,  then 
de-emphasis  must  be  employed  before  the  speech  is  written  to 
the output file. The inverse function of the pre-emphasis
is used for this:

y(n) = x(n) + 0.9y(n-1)    (3-2)

The  speech  is  finally  scaled  so  that  it  can  be  listened  to 
by  using  the  "AUDIOHIST"  or  "AUDIOMOD"  programs. 
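
That equation (3-2) undoes equation (3-1) can be checked with a short round trip. The sketch below assumes zero initial conditions for both filters; the function names are illustrative only.

```python
def pre_emphasize(x, mu=0.9):
    """Equation (3-1): y(n) = x(n) - mu*x(n-1), with x(-1) = 0."""
    return [x[n] - (mu * x[n - 1] if n > 0 else 0.0) for n in range(len(x))]


def de_emphasize(x, mu=0.9):
    """Equation (3-2): y(n) = x(n) + mu*y(n-1), with y(-1) = 0."""
    y = []
    prev = 0.0
    for v in x:
        prev = v + mu * prev
        y.append(prev)
    return y


if __name__ == "__main__":
    speech = [1.0, 2.0, -1.0, 0.5]
    restored = de_emphasize(pre_emphasize(speech))
    print(all(abs(a - b) < 1e-12 for a, b in zip(speech, restored)))
```

The de-emphasis filter is recursive, so it must be applied to the whole output stream in order rather than frame by frame with its state reset.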

Synchronous  Analysis 

The  LPC  system  developed  in  this  thesis  uses  a 
synchronous  method  of  analysis.  Synchronous  analysis 
requires  an  update  of  the  predictor  coefficients  once  every 
pitch  period  (see  figure  3-4).  Because  of  the  delay  in 
determining  the  correct  pitch,  the  predictor  coefficients 
are  generated  with  a  two  frame  delay  with  respect  to  the 
pitch detection. The shift interval of the coefficient
analysis is a multiple of the pitch period (P1 in figure
3-3). When the start of an analysis frame falls after the
start  of  the  next  pitch  detection  frame,  the  value  of  this 
pitch  (P2)  is  used  as  the  shift  interval.  The  frame 
start  is  then  shifted  as  before  until  a  new  pitch  value  is 
needed.  At  this  point,  a  new  frame  is  estimated  for  pitch. 

In  a  synchronous  system,  analysis  on  each  voiced 
segment  of  speech  begins  at  the  beginning  of  the  pitch 
period  and  the  analysis  frame  is  shifted  at  an  interval 
which is a multiple of the pitch period (see figure 3-3).
The vocoder only synthesizes the speech between the starts of
consecutive analysis frames. For this reason the analysis frame
size is written to the channel file. Synchronous analysis
avoids  the  problem  of  having  a  pitch  period  of  voiced  speech 
straddling  the  boundary  between  two  consecutive  frames. 
When  analysis  straddles  two  frames,  the  coefficients  must  be 
interpolated  for  the  portion  of  speech  reproduced  over  the 
boundary.  The  interpolated  values  of  these  coefficients  are 
not  guaranteed  to  give  stable  results. 

During  unvoiced  speech,  the  shift  interval  is  constant. 
A  constant  frame  rate  (MAXFR)  is  used  for  shifting  the 
coefficient  analysis. 
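
The choice of shift interval can be sketched as follows. The multiple of the pitch period is not stated in the text, so periods_per_frame is a hypothetical parameter, as are the function names.

```python
def shift_interval(voiced, pitch_period, maxfr, periods_per_frame=1):
    """Shift for the next coefficient analysis frame, in samples.

    Voiced speech: a multiple of the pitch period (the multiple is an
    assumption). Unvoiced speech: the constant interval MAXFR.
    """
    if voiced:
        return periods_per_frame * pitch_period
    return maxfr


def frame_starts(decisions, maxfr, start=0):
    """Start sample of each analysis frame for a list of (voiced, period)."""
    starts = [start]
    for voiced, period in decisions:
        starts.append(starts[-1] + shift_interval(voiced, period, maxfr))
    return starts


if __name__ == "__main__":
    decisions = [(True, 80), (True, 82), (False, 0)]
    print(frame_starts(decisions, maxfr=200))
```

The differences between successive starts are the frame sizes that the analyzer writes to the channel file, which is what lets the vocoder produce frames of varying length.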

The  three  frame  delay  of  the  pitch  detector  requires  a 
delay  of  the  coefficient  analysis  by  a  corresponding  time 
period.  Therefore  the  frame  starts  and  boundaries  of  the 
pitch  analysis  frame  and  the  coefficient  analysis  frame  are 
different.  This  is  the  reason  for  the  memory  of  the  pitch 
and  energy  detectors  and  the  initial  large  array.  With  this 
array,  the  current  pitch  can  be  calculated  with  one  portion 
of  the  data,  and  used  for  interpolation  of  past  pitch.  At 
the same time, the delayed pitch can be used to determine
the correct starting location in the past for predictor
coefficient generation. Therefore the system can perform
coefficient  calculation  in  step  with  pitch  detection.  This 
does  tend  to  limit  flexibility  because  it  does  not  allow 
asynchronous  analysis. 
Some complexity is added because of the synchronous
nature of the analysis. For example, the bookkeeping for
each frame is doubly complex; two sets of counters must be
maintained. It is also confusing that the pitch detection
and predictor coefficient generating routines do not work on
the same data simultaneously. This requires that the
predictor coefficient generating routines operate after the
pitch has been calculated for the final frame. Substantial
gains are realized, though, because interpolation of
predictor coefficients need not be performed. Without
synchronous analysis, counters would also have to be
maintained in the vocoder to mark the beginning and ending
of boundary-overlapping sections of speech.

Time  Constraints 

Unfortunately,  the  system  does  not  run  in  real-time.  A 
number  of  factors  affect  the  speed  of  the  system,  among 
these:  the  software  implementation,  the  audio  interface,  and 
the  speed  of  the  computer. 

The  biggest  constraint  on  time  is  the  software  in  the 
system.  Since  the  software  is  written  in  a  high  level 
language,  its  speed  is  limited  by  the  constraints  of  the 
language.  Some  of  these  constraints  are  the  time  necessary 
to  write  to  a  file  and  the  operation  rate  of  the 
minicomputer.  A  hardware  implementation  would  not  be  under 
these  constraints.  The  software  is  also  a  bit  cumbersome  in 
that  it  must  be  flexible  enough  to  operate  properly  with 
many different possible prediction schemes. For use in the
laboratory,  the  transmitter  and  the  receiver  programs  must 
be  run  sequentially,  whereas  they  would  operate  in  parallel 
if  in  a  normal  communication  configuration. 

Another time constraint is that the audio interface in
the lab is not prepared for real time events. The program
which performs the digital to analog calculations and
channel calling routines required to listen to the
synthesized speech must read a file first. The size of this
file is limited by the program to less than three seconds.
Therefore, long utterances are impossible to process and
listen to without interruption and user interaction with the
system.

Summary 

The two-program nature of the LPC system is used to
imitate the transmitter and receiver nature of a true
communication link. These two programs are the LPC analyzer
and the LPC synthesizer. The channel between them is a file
written in the memory of the computer. Subroutines are used
extensively to permit easy examination of internal results
and provide flexibility to run or add different subroutines
which perform the same analysis in different ways.
Synchronous analysis is employed to simplify synthesis of
the output speech.
CHAPTER IV


Testing and Results


The  System 

In this thesis, an LPC speech processing system was designed
which operates in the Signal Processing Laboratory at the
Air Force Institute of Technology. It is an operational
model  and  replicates  a  number  of  aspects  of  a  true  LPC 
speech  communication  system.  Unfortunately,  this  system 
does  not  run  in  real  time,  as  it  is  written  entirely  in 
software  and  must  access  and  write  files  which  reside  in  the 
minicomputer's  memory.  A  real  LPC  processor  and  synthesizer 
would  receive  and  transmit  signals  over  a  communication 
channel,  and  except  for  buffering,  would  require  no  reading 
or  writing  from  files.  Most  of  these  operations  would  be 
handled  by  more  time-efficient  hardware. 

A  simple  test  of  the  system  consists  of  actually 
processing  an  utterance.  Each  utterance  is  a  digitized 
version  of  a  sentence  spoken  by  a  human.  These  utterances 
or  speech  files  are  relatively  noise-free  so  that  the  noise 
problem  of  LPC  would  not  need  to  be  considered.  These 
utterances  were  successfully  processed  by  the  system  to  give 
intelligible  results. 

To demonstrate the flexibility of the system, various
combinations of shift intervals, analysis window size,
prediction methods, pitch detection algorithms, and
threshold values were tested. The results of these
combinations indicate that the system is flexible. Table
4-1 shows the allowable ranges of the decision variables.
The recommended values in this table are the values of the
decision variables which seemed to give the best results.
Other testing showed that with only a very few exceptions,
the system could handle all of the combinations with which
it was tested. The exceptions are noted below.

On  occasion,  especially  when  a  low  pitched  voice  was 
processed,  the  vocoded  speech  was  so  unintelligible  that  it 
was  impossible  to  determine  that  an  utterance  was  present. 
This  problem  arises  from  the  nature  of  the  synchronous 
analysis.  Synchronous  analysis  demands  that  two  shift 
intervals  be  maintained,  one  for  the  pitch  analysis  window 
and  one  for  the  coefficient  analysis  window.  During  voiced 
speech,  the  shift  interval  for  the  coefficient  analysis 
window  is  based  on  the  pitch  period,  but  the  bookkeeping 
counters  are  based  on  the  shift  interval  of  the  pitch 
detector.  If  the  shift  interval  is  less  than  the  length  of 
the  longest  pitch  period  during  voiced  speech,  the  analysis 
window  used  to  calculate  the  prediction  coefficient  is 
incorrectly  bounded.  Results  in  this  case  are  consistently 
poor, and result in unstable filters which produce
unintelligible clicks, buzzes, and squeals. To eliminate
this problem, the frame shift interval must be increased to
accommodate the lowest pitch frequency of the file. For the
file which contained the lowest pitch this interval is
between 80 and 100 samples.

Table 4-1
Ranges of the Decision Variables

Decision      Lower      Upper          Recommended
Variable      Limit      Limit          Value

POLES         6          20             16
MP (a)        -          -              1
MAXFR         100        400            200
MAXPT (b)     80         200            100
NEMP (a)      -          -              1
NGLT (a)      -          -              3
HI (a)        -          -              1
MPCH (a)      -          -              0
NPCS (a)      -          -              1
STHR          0.0        1.0            0.35
SCAF          1.0        1000.0         1.0
THRESH        0.0        1000000.0      250.0
UNGA          0.001      100.0          0.1

a) These variables are flags which determine whether
subroutines will be performed or not. They do not have
upper or lower limits over a range.

b) It is recommended that the value of MAXPT be half the
value of MAXFR. This gives a 2:1 overlap.

Two methods of predictor coefficient generation are
included in the system and were examined extensively. Both
give acceptable results and produce intelligible speech at
the output. Although theory indicates that the covariance
method of prediction need not be windowed, much better
results are attainable when the incoming speech signal is
weighted by a Hamming window. In this context, better
results mean higher quality. Without the windowing, the
method often produces unstable output filters; with the
windowing, the resultant speech exceeded the quality of
speech produced with the autocorrelation method.

The SIFT routine [Ref 9,10] for pitch detection was the
only satisfactory pitch detector implemented. The SIFT
algorithm uses an inverse filtering technique to cancel the
effect of the formant structure. In a noise-free
environment, this detector can consistently differentiate
between voiced and unvoiced speech. It can also successfully
determine the pitch to produce natural sounding speech. An
autocorrelation method was also examined but the algorithm
did not give consistent results. The calculated pitch was
monotone except for a disconcerting waver.

The number of poles in the analysis was also varied.
Ten poles marked a qualitative boundary between clear speech
and muffled speech. Using too few poles gave vocoded speech
which was severely muffled. With such a small number of
poles the filter does not have the resolution required to
describe the complete formant structure of the modeled vocal
tract (see figures 4-1 to 4-5). With fewer than six poles,
the speech becomes unintelligible. With more than ten
poles, the quality of the resultant speech increases with
the addition of poles, with maximum quality reached at about
sixteen poles. With sixteen or more poles, the quality
remains approximately the same.

Noise 

Noise was introduced to each utterance by adding a
random noise signal to the file containing the utterance.
This was accomplished with the same random noise generating
subroutine which is used to excite the vocoder during
unvoiced speech. A separate program performs this noise
addition. The maximum value of the noise can be varied to
provide noise levels from zero to considerably in excess of
the speech power. Because speech is not a stationary
process, a true signal to noise level is difficult to
calculate. It varies widely if the noise level is constant
because the speech energy varies widely from low energy in
the fricatives (s, sh, f, th...) to much higher energy in the
voiced sounds such as vowels. For our concerns, a signal to
noise ratio (SNR) was determined by calculating the power of
the noise and comparing it to the power of the voiced
portion of a clean file. Although the true SNR may vary
from frame to frame and from speech file to speech file,
consistent results are possible. That is, equivalent
values of added noise gave approximately the same SNR in all
cases (see table 4-2) and the synthesized speech suffered
the same problems of unintelligibility.


Table 4-2
Noise

Maximum Value      Noise Power in      Signal to Noise
of Added Noise     the Frame (dB)      Ratio-SNR (dB)

0                  58                  20
100                61                  16
200                64                  14
500                70                  8
1000               73                  4
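
The SNR definition used for Table 4-2 (noise power compared with the power of the voiced portion of a clean file) can be sketched as follows. The uniform noise bounded by a maximum value is an assumption made for illustration, since the actual noise program is not reproduced here, and the function names are hypothetical.

```python
import math
import random


def power_db(samples):
    """Mean-square power of a block of samples, in dB."""
    mean_sq = sum(s * s for s in samples) / len(samples)
    if mean_sq == 0.0:
        return float("-inf")              # no power at all
    return 10.0 * math.log10(mean_sq)


def add_noise(speech, max_value, rng):
    """Add noise bounded by max_value; returns (noisy_speech, noise)."""
    noise = [rng.uniform(-max_value, max_value) for _ in speech]
    return [s + n for s, n in zip(speech, noise)], noise


def snr_db(voiced_clean, noise):
    """SNR: power of the clean voiced portion minus the noise power."""
    return power_db(voiced_clean) - power_db(noise)


if __name__ == "__main__":
    rng = random.Random(0)
    voiced = [1000.0 * math.sin(0.1 * n) for n in range(2000)]
    noisy, noise = add_noise(voiced, 500.0, rng)
    print(round(snr_db(voiced, noise), 1))
```

The table appears consistent with this definition: in every row the noise power and the SNR sum to roughly 77 to 78 dB, as would be expected if the voiced speech power were the same in each case.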

To test the effects of noise, speech files with
different signal to noise ratios were tested. Degradation
occurred even with fairly small amounts of added noise (SNR =
16 dB). Severe degradation of the re-synthesized speech
occurred with an SNR of about 8 dB for the noisy input signal.
At this level, the output was unintelligible.

The analyzer consists of two parts, a pitch detector
and a prediction coefficient generating routine. These two
parts were examined to determine which is the more sensitive
to noise corruption. This was accomplished by processing
two versions of the same utterance, one which was an
unaltered version of the original speech file, and one which
was the original file with random noise added.
These files were processed separately within the same
program, one for pitch and the other for the prediction
coefficients. Preliminary results indicate that the
predictor coefficients are more susceptible to noise than is
the pitch detection.

With  this  processing  scheme,  four  permutations  of  noisy 
and  clean  files  are  possible.  Two  permutations  use  the  same 
file  for  pitch  detection  and  coefficient  generation.  One 
performs  pitch  detection  on  a  clean  file  and  coefficient 
generation  on  a  noisy  file,  and  one  performs  pitch  detection 
on a noisy file and coefficient generation on a clean file.

The clean/clean test was used as a control example and
the other permutations were examined for a qualitative
analysis of intelligibility. The signal to noise ratio was
maintained constant over all of the files. Noisy speech
files with an SNR of 8 dB gave unintelligible results but were
identifiable as speech, so this level was used as the noisy
file. The mixed analysis (noisy/clean, clean/noisy) gave
interesting results. The noisy coefficient/clean pitch gave
highly degraded speech which was only slightly better than
the noisy/noisy analysis. Contrasting with this, the clean
coefficient/noisy pitch gave only slightly degraded results.
Three of the five tests were noticeably degraded, yet still
intelligible. Two were virtually indistinguishable from the
clean/clean example. It was expected that the noise would
degrade the pitch detection severely and render any
reproduction unintelligible. In the five tests, this was not
the case: the predictor coefficient generating algorithm
failed. This can be heard in the output speech and seen in
a plot of the frequency response of the ensemble of the
digital filters (figures 4-6 to 4-11). The formant
structure is lost in the noise; only the first formant can
be located and identified on these plots.

Figure 4-10  Formant Trajectory of the utterance "five ...".
Added noise gives SNR of 8 dB.

Figure 4-11  Formant Trajectory of the utterance "five ...".
Added noise gives SNR of 4 dB.

To further examine this phenomenon, the glottal pulse
excitation of the output filter was replaced by the random
noise excitation. The results in this case sounded like a
whispered utterance, but were still intelligible. In fact,
preliminary results indicate that in a high noise
environment, intelligibility can be gained by neglecting
pitch information at the output and generating the utterance
with only a random excitation to the output filter.
The waveshape of the synthesized speech was examined to
possibly locate the source of the unintelligibility. It was
determined that the voiced frames in the synthesized speech
are the main cause of the squeals and buzzes which make the
speech unintelligible.
CHAPTER V


Conclusions  and  Recommendations 


Conclusions 

This  report  describes  a  system  which  processes  speech 
using  linear  predictive  methods.  The  system  is  a  software 
simulation  of  an  LPC  analyzer  and  synthesizer.  The  system 
consists  of  two  programs,  one  of  which  processes  the  speech 
to  generate  the  LPC  parameters,  and  another  which  processes 
these  parameters  to  resynthesize  the  speech.  An  important 
aspect  of  the  system  is  that  it  enables  the  user  to  select 
from  various  pitch  and  coefficient  analysis  methods.  It 
also  allows  the  user  to  vary  other  parameters  in  order  to 
simulate  other  changes  in  the  processing  scheme. 

To test the operation of the system, a regimen of
testing was performed by varying the different parameters.
A separate program allows a simple method for changing all
of the parameters over which the user has control. These
parameters are called the decision variables and each has an
allowable range of values. The system operated
satisfactorily over all values of the decision variables.
The flexibility exhibited by the system in this testing
indicates that the system can be a valuable tool for the
study of linear predictive coding of speech in the Signal
Processing Lab at AFIT.

Some of the parameters which were tested extensively
were the number of poles in the analysis, and the different
methods of analysis and pitch detection. It was determined
that ten poles give a reasonable representation of speech.
The covariance method of analysis exceeded the
autocorrelation method with respect to quality of output
speech. The SIFT pitch detection routine far exceeded the
AUTOC method in determining pitch.

Also examined were some of the noise problems of LPC.
Various noise levels were tested to determine at which level
noise corruption rendered the LPC system useless. This
level was found to be at a signal to noise ratio of about
8 dB. Another important result was that the coefficient
generation was greatly affected by noise. The effect of
noise on the predictor coefficients was much greater than
its effect on the pitch detection. This result may be useful
in exploring techniques to counter the effects of noise
corruption.

Recommendations

The linear predictive coding system presented in this
thesis can be used as a firm foundation for more study in
the process of linear predictive coding of speech.
Continuing effort with this project could extend in two
general directions. One direction would be software
oriented with further work being done to expand the system
with more subroutines. The other general direction is
oriented to studying more about LPC using the system as a
tool.
Further Work in Software Development

Part of the flexibility of the system stems from the
extensive use of subroutines. Additional subroutines could
be incorporated into the system to expand the present
capabilities of the LPC analyzer. Perhaps the first task to
be attempted would be to incorporate the recursive LPC
method as developed by Capt Willis Janssen [Ref 4] into this
system. This would offer an opportunity to compare this
method with some of the more common techniques which have
been implemented in the current system. A lattice
formulation of the predictor coefficients would offer
another method of analysis. This method is described in the
book by Rabiner and Schafer [Ref 10]. An undebugged version
of a possible subroutine implementation is presented in
Appendix D.
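For readers who want a reference point when comparing methods, the autocorrelation technique can be sketched in a few lines. The sketch below is written in Python rather than FORTRAN 5 so that it can be run on a modern machine. It is not the thesis's DAUTO subroutine; the function names and the sign convention (the sample s[n] is predicted as the sum of a[j] * s[n - j]) are assumptions made for illustration. The recursion also yields the reflection coefficients on which a lattice formulation is built.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag] for one analysis frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations by Levinson-Durbin recursion.

    Returns (a, k, err): predictor coefficients a[1..order] (a[0] is unused
    and left at 0.0), reflection coefficients k[1..order], and the final
    prediction error energy.
    """
    a = [0.0] * (order + 1)
    k = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            # Silent or degenerate frame: leave the remaining terms at zero.
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k[i] = acc / err
        new_a = a[:]
        new_a[i] = k[i]
        for j in range(1, i):
            new_a[j] = a[j] - k[i] * a[i - j]
        a = new_a
        err *= (1.0 - k[i] * k[i])
    return a, k, err
```

A typical call would be `a, k, err = levinson_durbin(autocorrelation(frame, 10), 10)` for a ten pole analysis.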

Other useful additions to the present system would be
additional pitch detection methods. The AUTOC method might
be altered slightly to give better results. The current
literature describes other methods of pitch detection. The
system would be greatly enhanced if it offered
more methods with which to analyze the input
speech signal.
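As a starting point for such experiments, a bare autocorrelation pitch detector might look like the following Python sketch. It is not the AUTOC subroutine of this system; the function name, the lag limits, and the use of a normalized autocorrelation peak as the voiced/unvoiced test are all assumptions chosen for illustration.

```python
import math


def detect_pitch(frame, min_lag, max_lag, voiced_threshold):
    """Return the pitch period in samples, or 0 if the frame looks unvoiced.

    The autocorrelation at each candidate lag is divided by the frame
    energy, so a perfectly periodic frame gives a peak close to 1.0.
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0
    best_lag, best_value = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        value = sum(frame[i] * frame[i + lag]
                    for i in range(len(frame) - lag)) / energy
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag if best_value >= voiced_threshold else 0


# Example: a 320 sample frame of a sinusoid with a 50 sample period.
frame = [math.sin(2.0 * math.pi * n / 50.0) for n in range(320)]
period = detect_pitch(frame, 20, 120, 0.35)
```

Altering the threshold or the lag limits in a sketch of this kind is one inexpensive way to see how sensitive the voiced/unvoiced decision is before changing the FORTRAN subroutine itself.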

A process for simulating the bit rates actually
transmitted over the channel would be a useful addition. At
present, all quantization is done in 2 byte (16 bit)
segments. The coefficients sent over the channel are 4 byte
floating point words. The flags sent over the channel
require at most 2 bits each, but each is quantized as a 16 bit
integer. The pitch is likewise represented as a 16 bit
integer; it could easily be represented with fewer bits.
The frame size information sent over the channel is
redundant and could be eliminated. All of these
compressions could reduce the effective bit rate of the
communicated signal.
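The arithmetic behind this recommendation can be sketched as follows, in Python for convenience. The field widths for the present system come from the paragraph above; the 8000 Hz sampling rate, the fixed advance of 80 samples per frame, the 32 bit gain word, and the 7 bit pitch field are assumptions made only to produce an illustrative number.

```python
def bits_per_frame(poles, flag_bits, frame_size_bits, pitch_bits,
                   gain_bits, coef_bits):
    """Bits needed to send one frame of LPC parameters over the channel."""
    return (flag_bits + frame_size_bits + pitch_bits + gain_bits
            + poles * coef_bits)


def bit_rate(frame_bits, samples_per_frame, sample_rate_hz):
    """Channel bit rate in bits per second for a fixed frame advance."""
    return frame_bits * sample_rate_hz / samples_per_frame


# Present system: 16 bit flag, frame size and pitch, 32 bit gain,
# ten 32 bit coefficients.
present = bits_per_frame(10, 16, 16, 16, 32, 32)

# Hypothetical compression: 2 bit flag, no frame size, 7 bit pitch.
compressed = bits_per_frame(10, 2, 0, 7, 32, 32)

present_rate = bit_rate(present, 80, 8000)
compressed_rate = bit_rate(compressed, 80, 8000)
```

Under these assumptions the coefficient words dominate the total, which suggests that quantizing the coefficients would matter far more than trimming the flag and pitch fields.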

Also needed is a better interface with the audio input
and output. The present means of listening to processed speech
is to move the file containing the speech to another
directory (on a different system even) and invoke another
program. This method is time consuming and reduces the
effectiveness of a synthesize-listen-compare atmosphere of
testing.

Further  Testing 

Since  this  simulation  was  designed  as  a  tool  to  study 
the  process  of  linear  predictive  coding  of  speech,  it  seems 
only  natural  that  considerable  further  testing  can  be 
imagined.  A  fine  place  to  start  further  research  is  with 
the  ubiquitous  noise  problem. 

A possible technique for reducing the effect of noise
was discovered in this work. The technique is to ignore the
pitch detector information. If all speech is presumed to be
unvoiced, the synthesized speech will resemble whispered
speech. In noise, the greatest difference between the input
speech and the output speech occurred during periods of the
utterance declared to be voiced speech. I recommend that this
technique be explored further.
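A minimal Python sketch of the idea is given below. It is not the VOCODE program; the function names, the impulse train used for voiced excitation, and the unit variance Gaussian noise used for unvoiced excitation are assumptions for illustration. Setting the hypothetical `force_unvoiced` argument makes every frame noise excited, which is what produces the whispered quality.

```python
import random


def excitation(n_samples, pitch_period, rng, force_unvoiced=False):
    """Noise for unvoiced frames, an impulse train for voiced frames."""
    if force_unvoiced or pitch_period <= 0:
        return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    return [1.0 if i % pitch_period == 0 else 0.0 for i in range(n_samples)]


def all_pole_filter(u, a, gain, memory=None):
    """Compute s[n] = gain * u[n] + sum over j of a[j-1] * s[n-j].

    `memory` holds the previous outputs s[n-1], s[n-2], ... so that the
    filter state can be carried from one frame to the next.
    """
    p = len(a)
    mem = list(memory) if memory is not None else [0.0] * p
    out = []
    for x in u:
        s = gain * x + sum(a[j] * mem[j] for j in range(p))
        out.append(s)
        mem = ([s] + mem)[:p]
    return out, mem


rng = random.Random(0)
whisper_input = excitation(160, 57, rng, force_unvoiced=True)
whisper, state = all_pole_filter(whisper_input, [0.5], 0.1)
```

Comparing the output of the two excitation modes on the same noisy coefficients would show how much of the degradation is due to pitch errors and how much to the coefficients themselves.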

Dr Kabrisky is interested in a method of compressing
the formant frequencies while maintaining the ratio between
them. This system models the human speech production system
with the poles of a digital filter. These poles describe
the formant locations. Therefore, digital processing
techniques could be used to shift the poles and consequently
the formants.
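One way to picture such a shift is sketched below in Python. A formant frequency corresponds to the angle of a pole, so multiplying every pole angle by the same factor moves all formants while keeping their ratios fixed. The function names, the 8000 Hz sampling rate, and the example pole locations are assumptions; finding the poles from the predictor coefficients requires a polynomial root finder, which is not shown.

```python
import cmath
import math


def scale_pole_angles(poles, factor):
    """Multiply each pole angle by `factor`, leaving its radius unchanged."""
    return [cmath.rect(abs(p), cmath.phase(p) * factor) for p in poles]


def formant_frequencies(poles, sample_rate_hz):
    """Frequencies in Hz of the upper half plane poles, in ascending order."""
    return sorted(cmath.phase(p) * sample_rate_hz / (2.0 * math.pi)
                  for p in poles if p.imag > 0.0)


def polynomial_from_roots(roots):
    """Coefficients of the product of (1 - r * z^-1), lowest power first."""
    coefs = [1.0 + 0.0j]
    for r in roots:
        coefs = [c - r * prev
                 for c, prev in zip(coefs + [0.0], [0.0] + coefs)]
    return [c.real for c in coefs]


fs = 8000.0
upper = [cmath.rect(0.95, 2.0 * math.pi * f / fs) for f in (500.0, 1500.0)]
poles = upper + [p.conjugate() for p in upper]
shifted = scale_pole_angles(poles, 0.8)
new_filter = polynomial_from_roots(shifted)
```

The factor should be kept small enough that no scaled angle passes half the sampling rate; a factor below one, as in a compression of the formants, always satisfies this.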

One final recommendation is to use this system as a
means to study the speech recognition capabilities of LPC.
Other research has shown the feasibility of LPC for speech
recognition tasks [Ref 2]. The feature vector as described
by the predictor coefficients is easily extractable from
this system. The flexibility of the system allows the user
to vary a wide range of parameters in search of a set which
expedites recognition.




Bibliography


1. Atal, B. S. and Remde, J. R. "A Model of LPC Excitation
for Producing Natural-Sounding Speech at Low Bit
Rates," Proc. IEEE Conf. on Acoustics, Speech, and
Signal Processing, pp. 614-617, 1982.

2. Doddington, G. R. and Schalk, T. B. "Speech Recognition:
Turning Theory to Practice," IEEE Spectrum, pp.
26-32, Sep 1981.

3. Hunter, C. J. Time Axis Analysis of Gravity Distorted
Speech. MS Thesis GE/EE/81D-*. Wright Patterson AFB,
Ohio: School of Engineering, Air Force Institute of
Technology, Dec 1981.

4. Janssen, W. A. A Recursive Linear Predictive Vocoder.
MS Thesis GE/EE/83D-33. Wright Patterson AFB, Ohio:
School of Engineering, Air Force Institute of
Technology, Dec 1983.

5. Kelton, W. D. and Law, A. M. Simulation Modeling and
Analysis, New York: McGraw-Hill, 1982.

6. Kinderman, A. J. and Ramage, J. G. "Computer Generation
of Normal Random Variables," Journal of American
Statistical Association, vol. 71, Dec 1976.

7. Makhoul, J. "Linear Prediction: A Tutorial Review,"
Proc. of the IEEE, vol. 63, pp. 561-580, April 1975.

8. Makhoul, J. "Stable and Efficient Lattice Methods for
Linear Prediction," IEEE Trans. Acoust., Speech,
Signal Processing, vol. ASSP-25, pp. 423-428, Oct.
1977.

9. Markel, J. D. "The SIFT Algorithm for Fundamental
Frequency Estimation," IEEE Trans. Audio and
Electroacoustics, vol. AU-20, no. 5, Dec. 1972.

10. Markel, J. D. and Gray, A. H. Linear Prediction of
Speech, New York: Springer-Verlag, 1976.

11. Rabiner, L. R. and Schafer, R. W. Digital Processing
of Speech Signals, Englewood Cliffs, NJ: Prentice-Hall,
1978.

12. Rabiner, L. R. and Schafer, R. W. "Digital
Representations of Speech Signals," Proceedings of the
IEEE, vol. 63, pp. 662-677, Apr 1975.

13. Rosenberg, A. E. "Effects of Glottal Pulse Shape on the
Quality of Natural Vowels," J. Acoust. Soc. Am., vol.
49, pp. 583-590, Feb 1971.


APPENDIX  A 


User's Manual




USER'S  MANUAL 


A  LINEAR  PREDICTIVE  CODING  SYSTEM 

DESIGNED  AND  WRITTEN  BY 
LT  CRAIG  E.  MCKOWN 


THIS USER'S MANUAL IS COMPOSED OF THREE PARTS, EACH
CORRESPONDING TO A SEPARATE PROGRAM WHICH IS REQUIRED TO
OPERATE THE COMPLETE SYSTEM. THE FIRST PART DESCRIBES THE USE
OF THE PROGRAM SETUP, WHICH IS USED TO CREATE THE DECISION
VARIABLE FILES. THE SECOND PART DESCRIBES THE USE OF THE
PROGRAM PREDICT, WHICH IS THE LPC ANALYZER. THE LAST PART
DESCRIBES THE USE OF THE PROGRAM VOCODE, WHICH SYNTHESIZES THE
VOCODED SPEECH.

BEFORE PREDICT OR VOCODE CAN BE RUN, A DECISION VARIABLE
FILE MUST EXIST. TO CREATE A NEW DECISION VARIABLE FILE OR
TO UPDATE AN OLD ONE, THE PROGRAM SETUP MUST BE USED. THE
OUTPUT OF PREDICT IS THE INPUT TO VOCODE, SO PREDICT MUST BE
RUN BEFORE VOCODE.

THE SPEECH INPUT TO PREDICT MUST BE IN A CONTIGUOUS FILE,
IN INTEGER FORM. PREDICT AND VOCODE HAVE NO LIMITATION ON THE
LENGTH OF THE SPEECH FILE, BUT THE AUDIO INTERFACE DOES. IT IS
LIMITED TO 88 BLOCKS (2.8 SECONDS). THEREFORE IT IS RECOMMENDED
THAT THE PROCESSED SPEECH BE LIMITED TO 88 BLOCKS.

PROGRAM SETUP


FILE:       SETUP.FR
DIRECTORY:  DP4:KOWN
LANGUAGE:   FORTRAN5
DATE:       SEP 83
AUTHOR:     W. JANSSEN / REVISED BY CRAIG MCKOWN
SUBJECT:    CREATES FILE OF DECISION VARIABLES NEEDED BY
            THE MCKOWN LPC ANALYZER.


ARGUMENTS     TYPE            PURPOSE
& VARIABLES

RELVAR        REAL ARRAY      REAL VALUED DECISION VARIABLES
INTVAR        INTEGER ARRAY   INTEGER VALUED DECISION VARIABLES
SIZER         INTEGER         NUMBER OF ELEMENTS IN RELVAR
SIZEI         INTEGER         NUMBER OF ELEMENTS IN INTVAR
OUTFILE       STRING          NAME OF DECISION VARIABLE FILE


FUNCTION:

THIS PROGRAM CREATES A FILE CONTAINING THE DECISION
VARIABLES (DV) REQUIRED BY THE LPC ANALYZER DESIGNED BY C.
MCKOWN. IT CAN CREATE A NEW FILE OR OVERWRITE AN OLD FILE.
THE PROGRAM WILL PROMPT THE USER FOR ALL NECESSARY INPUTS.
THE CURRENT VALUE OF EACH DV WILL BE SHOWN AND THE USER
WILL BE GIVEN AN OPTION OF CHANGING EACH ONE.

THE PROGRAM WILL ALSO PRINT OUT THE DECISION VARIABLES
IN A READABLE FORMAT TO THE SCREEN OR THE PRINTER OR
BOTH, AS DESIRED BY THE USER.


PROGRAM USE:

THE PROGRAM IS LOADED BY THE FOLLOWING COMMAND:

RLDR SETUP @FLIB@

RUN THE PROGRAM:

"SETUP"

THE FIRST PROMPT WILL ASK IF YOU ARE UPDATING AN OLD
FILE. ANSWER YES ("1") IF THE DV FILE WAS CREATED PREVIOUSLY.
THE NEXT PROMPT WILL ASK FOR THE FILE NAME; RESPOND
WITH THE "FILENAME" OF THE FILE YOU WISH TO PREPARE. THE
OLD FILE WILL BE OVERWRITTEN BY ANY CHANGES MADE.

THE REST OF THE PROGRAM IS EXPLAINED BY THE PROMPTS.

SEE USER'S MANUAL FOR PROGRAM "PREDICT" FOR A LIST OF
THE NAMES OF THE VARIABLES TO BE CHANGED OR SET.


SUBROUTINES  REQUIRED: 
NONE 


CHANGES:

ADDING NEW DECISION VARIABLES IS NOT DIFFICULT.
ADDITIONAL SPACE REMAINS IN EACH DV ARRAY FOR AT LEAST
FIVE MORE VARIABLES. THE PROGRAM MUST BE UPDATED IN
FOUR PLACES FOR EACH ADDITIONAL VARIABLE.

1)   IN THE *** UPDATE ARRAYS *** SECTION. FOLLOW
     THE FORMAT OF THE OTHER VARIABLE UPDATES. YOU
     MUST CHANGE THE LINE NUMBER AFTER THE "IF()GOTO"
     IN THE UPDATE PRECEDING THE ADDITIONAL UPDATE.

2)   IN THE *** TYPE ARRAYS *** SECTION. FOLLOW THE
     FORMAT OF THE OTHER TYPE STATEMENTS.

3&4) IN THE *** OUTPUT FILE *** SECTION. A NEW WRITE
     STATEMENT AND FORMAT STATEMENT MUST BE ADDED FOR
     EACH NEW VARIABLE. FOLLOW THE FORMAT OF THE OTHER
     WRITE AND FORMAT STATEMENTS.


EXAMPLE:

SETUP

THIS PROGRAM CREATES OR UPDATES A DECISION VARIABLE FILE.
ARE YOU UPDATING AN OLD FILE?
(1-YES,0-NO) 1
FILE NAME? DECVARR

IF YOU CHOOSE TO CHANGE A VARIABLE ENTER Y
OTHERWISE ENTER ANOTHER LETTER

CURRENT VALUE OF ACCEPT/NOT ACCEPT (A-0,NA-1)= 1
CHANGE VALUE?

CURRENT NUMBER OF POLES IS = 10
CHANGE VALUE?

THE METHOD OF PREDICTION IS
(AUTO-0, COVAR-1): 0
CHANGE VALUE?

CURRENT VALUE: NO. OF POINTS/SET (MAXFR): 200
CHANGE VALUE?
Y
INPUT NEW VALUE: 160

THE CURRENT VALUE OF FILTER SPACINGS IS (MAXPT)= 100
CHANGE VALUE?
Y
INPUT NEW VALUE= 80

OF PRE/DE-EMP (1-Y/0-N) IS: 1
CHANGE VALUE?

THE CURRENT VALUE OF GLOTTAL SHAPE IS
(1-POLYNOMIAL/3-IMPULSE) = 3
CHANGE VALUE?

THE CURRENT VALUE OF HAMMING WINDOW (0-NO/1-YES):
CHANGE VALUE?

THE METHOD OF PITCH DETECTION IS
(SIFT-0,AUTOC-1): 0
CHANGE VALUE?

PITCH DET'N AND COEF. CAL'N FROM SAME FILE?
CURRENT VALUE (1-Y,0-N)= 1
CHANGE VALUE?

THE CURRENT VALUE OF VOICED/UN THRESH IS= .400800
CHANGE VALUE?
Y
INPUT NEW REAL VALUE= .35

CURRENT VALUE OF SPEECH SCALE-(IN CODER)= 1.00000
CHANGE VALUE?

CURRENT VALUE OF SILENCE THRESH-(IN ENER)IS= 350.000
CHANGE VALUE?

CURRENT VALUE OF UNVOICED GAIN FACTOR IS= .100000
CHANGE VALUE?

THE ARRAYS HAVE BEEN LOADED.

DO YOU WANT TO HAVE THE ARRAY TYPED(1-YES,0-NO)=
ACCEPT/NOT ACCEPT = 1
NUMBER OF POLES= 10
METHOD (0-AUTO,1-COVAR)= 0
MAXFR= 160
MAXPT= 80
PRE/DE-EMP (1-Y,0-N)= 1
GLOT (1-POLYNOMIAL,3-IMPULSE)= 3
HAMMING WINDOW? (1-Y,0-N)= 1
METHOD PITCH DET (0-SIFT,1-AUTOC)= 0
PITCH & COEF'S SAME FILE(1-Y/0-N)= 1
VOICED/UN THRESHOLD = .350000
SPEECH SCALE(IN CODER)= 1.00000
SILENCE THRESHOLD 350.000
UNVOICED GAIN FACTOR .100000

WRITE DECISION VARIABLES TO SAME FILE?
(1-YES/0-NO)= 1
PRINT ARRAY ON PRINTRONICS?(1-Y,0-N) 1
PROGRAM COMPLETED

STOP
R

DECVARR

DATE : 2/12/83
TIME : 13:56:47

ACCEPT/NOT ACCEPT                         1
NUMBER OF POLES                           10
METHOD (0-AUTO,1-COVAR)                   0
MAXFR                                     160
MAXPT                                     80
PRE/DE-EMP? (1-Y,0-N)                     1
GLOTTAL PULSE (1-POLY,3-IMPULSE)          3
HAMMING WINDOW? (1-Y,0-N)                 1
METHOD PITCH DET (0-SIFT,1-AUTOC)         0
PITCH & COEF'S F'M SAME FILE(1-Y,0-N)     1
VOICED/UNVOICED THRESHOLD                 .35000
SPEECH SCALE                              1.00000
SILENCE THRESHOLD                         350.00000
UNVOICED GAIN FACTOR                      .10000

PROGRAM PREDICT


FILE:       PREDICT.FR
DIRECTORY:  DP4:KOWN
LANGUAGE:   FORTRAN5
DATE:       SEP 83
AUTHOR:     CRAIG MCKOWN
SUBJECT:    DIGITAL PROCESSING OF SPEECH -
            LINEAR PREDICTION ANALYZER


ARGUMENTS        TYPE        PURPOSE
& VARIABLES

MAXPT            INTEGER     SAMPLES BETWEEN PITCH DETECTION
MAXFR            INTEGER     SAMPLES IN ANALYSIS WINDOW
NSET             INTEGER     COUNTER FOR PITCH FRAME NUMBER
NFRAME           INTEGER     COUNTER FOR LPC FRAME NUMBER
NPTS             INTEGER     NUMBER OF SAMPLE POINTS ANALYZED
K, S, KS, JS, JK INTEGERS    COUNTERS (USED FOR BOOKKEEPING)
PI               INTEGER     COUNTER FOR NUMBER OF SAMPLES TO
                             START OF NEXT LPC ANALYSIS WINDOW
JUMP             INTEGER     FLAG INDICATING NATURE OF PREVIOUS
                             FRAME OF SPEECH (VOICED, UNVOICED OR
                             SILENCE)
SPEEFL1          STRING      NAME OF SPEECH FILE (FOR LPC)
SPEEFL2          STRING      NAME OF SPEECH FILE (FOR PITCH)
DUMMY            STRING      NAME OF FILE HOLDING DECISION VARIABLES
PARAM            STRING      NAME OF FILE TO WHICH LPC DATA IS
                             WRITTEN (ACTS AS TRANSMISSION CHANNEL)

DECISION VARIABLES

POLES            INTEGER     NUMBER OF POLES IN THE OUTPUT FILTER
MP               INTEGER     METHOD OF PREDICTION
                             (0-AUTOCOR', 1-COVARIANCE)
NGLT             INTEGER     GLOTTAL PULSE SHAPE
                             (1-POLYNOM', 2-TRIGON', 3-IMPULSE)
MPCH             INTEGER     METHOD OF PITCH DETECTION
                             (0-SIFT, 1-AUTOC)
NPCS             INTEGER     PITCH/LPC FILES THE SAME (0-NO, 1-YES)
NEMP             INTEGER     PRE/DE-EMPHASIS (0-NO, 1-YES)
HI               INTEGER     HAMMING WINDOW (0-NO, 1-YES)
STHR             REAL        VOICED/UNVOICED THRESHOLD
                             (USED FOR PITCH DETECTION)
SCAF             REAL        SCALE FACTOR (INPUT SPEECH DIVIDED BY
                             THIS TO AVOID OVERFLOW)
THRESH           REAL        SPEECH/SILENCE THRESHOLD
UNGA             REAL        UNVOICED GAIN FACTOR (OUTPUTS UNVOICED
                             INPUT TO OUTPUT FILTER MULTIPLIED BY
                             THIS TO PREV

VAL              INT ARRAY   DUMMY ARRAY TO HOLD THE SAMPLED SPEECH
                             BEFORE IT IS WRITTEN TO SPEE & SPCH
SPEE             REAL ARRAY  ARRAY HOLDING DATA FOR LPC COEFFICIENT
                             GENERATION
SPCH             REAL ARRAY  ARRAY HOLDING DATA FOR PITCH DETECTION
AR               REAL ARRAY  ARRAY HOLDING THE LPC COEFFICIENTS
                             AR(1)=A0, AR(2)=A1, ... AR(POLES)=AP.
                             ONLY AR(2) TO AR(POLES) ARE WRITTEN TO
                             THE CHANNEL FILE
RCOF             REAL ARRAY  REFLECTION COEFFICIENTS
AENRG            REAL ARRAY  ENERGY FROM ENERGY DETECTOR
PITCH            REAL ARRAY  PITCH FROM PITCH DETECTOR
PIT              INTEGER     INTERPOLATED PITCH (WRITTEN TO CHANNEL)
VOCD             INTEGER     FLAG INDICATING NATURE OF SPEECH
                             (WRITTEN TO CHANNEL)
AL               REAL        ALPHA, ERROR COEFFICIENT, USED AS GAIN
                             FOR OUTPUT CHANNEL. COMPUTED IN
                             COEFFICIENT GENERATING ROUTINES.


FUNCTION:

THIS PROGRAM EMULATES THE ANALYSIS OF A LINEAR PREDICTIVE
CODING SCHEME. IT INPUTS SAMPLED SPEECH DATA AND PRODUCES
THE PARAMETERS REQUIRED BY A VOCODER TO REPRODUCE THE SPEECH.
THESE PARAMETERS ARE WRITTEN TO A FILE WHICH ACTS AS THE
COMMUNICATION CHANNEL. FOR MORE INFORMATION, SEE MCKOWN
THESIS.

THE FORM OF THE CHANNEL FILE IS COMPATIBLE WITH THE VOCODER
PROGRAM BY THE SAME AUTHOR.

PROGRAM USE:

THE PROGRAM IS LOADED WITH THE FOLLOWING COMMAND:

RLDR PREDICT IOF SIFTB ENER DCOVAR DIRECT DAUTO AUTOC @FLIB@

BEFORE RUNNING THIS PROGRAM, IT IS ADVISED THAT THE USER
CREATE (OR UPDATE) A FILE CONTAINING THE DECISION VARIABLES
REQUIRED TO PROPERLY EXECUTE THIS PROGRAM. THIS IS EASILY
ACCOMPLISHED BY USING THE PROGRAM "SETUP." SEE USER'S MANUAL
FOR THE PROGRAM "SETUP."

I RECOMMEND THAT A MACRO FILE BE EMPLOYED TO RUN THIS
PROGRAM AND THE VOCODER PROGRAM USED TO SYNTHESIZE THE
SPEECH. THE MACRO FILE SHOULD BE OF THE FORM:

PREDICT SPEECHFILE1/C SPEECHFILE2/P DECVAR/I CHANNELFILE/O

THE FILE SPEECHFILE1 IS THE NAME OF THE INPUT SPEECH FILE
USED FOR THE PREDICTOR COEFFICIENT GENERATION. THE FILE
SPEECHFILE2 IS THE NAME OF THE INPUT SPEECH FILE USED TO
ACCOMPLISH THE PITCH DETECTION. THE FILE DECVAR IS THE NAME
OF THE FILE WHICH CONTAINS THE DECISION VARIABLES.
THE NAME CHANNELFILE IS THE NAME OF THE FILE TO WHICH THE
LPC PARAMETERS ARE WRITTEN. IT MUST HAVE THE SAME NAME
AS THAT WHICH IS USED FOR VOCODE.


SUBROUTINES REQUIRED:

NAME:      LOCATION:    PURPOSE:

IOF        DP4:KOWN     READS RUN MACRO FILE
SIFTB      "            PITCH DETECTION
AUTOC      "            PITCH DETECTION
ENER       "            ENERGY DETECTION
DAUTO      "            LPC COEF GENERATION
DCOVAR     "            LPC COEF GENERATION
DIRECT     "            DIRECT FORM FILTER

NOTE:

SOME INFORMATION IS WRITTEN TO THE SCREEN TO ASSURE THE
USER THAT THE PROGRAM IS INDEED RUNNING. THE RUN TIME
OF THE PROGRAM IS ABOUT FOUR MINUTES, AND IS DEPENDENT
UPON THE METHODS OF PITCH DETECTION AND COEFFICIENT
GENERATION, AS WELL AS THE NUMBER OF POLES USED FOR ANALYSIS.


SEE USER'S MANUAL FOR "VOCODE"

PROGRAM VOCODE

FILE:       VOCODE.FR
DIRECTORY:  DP4:KOWN
LANGUAGE:   FORTRAN 5
DATE:       SEP 83
AUTHOR:     CRAIG MCKOWN
SUBJECT:    DIGITAL PROCESSING OF SPEECH
            LINEAR PREDICTION VOCODER


ARGUMENTS     TYPE        PURPOSE
& VARIABLES

SPEEFL        STRING      NAME OF A DUMMY FILE (NOT USED)
PARAM         STRING      NAME OF FILE FROM WHICH LPC DATA IS
                          READ (ACTS AS TRANSMISSION CHANNEL)
DUMMY         STRING      NAME OF A DUMMY FILE (NOT USED)
RUMMY         STRING      NAME OF THE FILE TO WHICH DIGITIZED
                          SPEECH IS WRITTEN (OUTPUT FILE).
                          SPEECH HAS BEEN NORMALIZED FOR AUDIO
                          OUTPUT WITH "AUDIOHIST"
AR            REAL ARRAY  LPC COEFFICIENTS READ FROM THE CHANNEL
                          FILE. AR(1) IS THE AR(2) FROM THE
                          CODER PROGRAM "PREDICT"
POLES         INTEGER     NUMBER OF POLES OF THE OUTPUT FILTER
VOCD          INTEGER     FLAG INDICATING UNVOCD/VOICED DECISION
                          FROM CODER
PIT           INTEGER     PITCH PERIOD IN SAMPLES READ FROM
                          CHANNEL FILE
IX            DP INT      SEED NUMBER FOR SUBROUTINE "UNVOCD"
U             REAL ARRAY  OUTPUT OF "VOICED" OR "UNVOCD" - INPUT
                          TO "THROAT"
W             REAL ARRAY  MEMORY FOR "THROAT"
S             REAL ARRAY  OUTPUT OF THROAT - VOCODED SPEECH
INTS          INT ARRAY   INTEGER VALUES OF S - WRITTEN IN BLOCK
                          FORM TO OUTPUT FILE
X             INT ARRAY   USED FOR WRBLK AND RDBLK
IS, IP, KS    INTEGERS    COUNTERS


FUNCTION:

THIS PROGRAM EMULATES THE VOCODER OF A LINEAR PREDICTIVE
CODING SCHEME. IT TAKES AS INPUTS THE LPC PARAMETERS
FROM A FILE WHICH ACTS AS THE COMMUNICATION CHANNEL, AND
USES THESE TO REPRODUCE DIGITAL SPEECH. FOR MORE INFORMATION
SEE THE MCKOWN THESIS.

THIS PROGRAM WAS WRITTEN TO BE USED IN CONJUNCTION WITH THE
LPC CODER PROGRAM, "PREDICT," BY THE SAME AUTHOR.

PROGRAM USE:

THE PROGRAM IS LOADED BY THE FOLLOWING COMMAND:

RLDR VOCODE IOF UNVOCD DRAND THROAT GLOT3 GLOT2 GLOT1 @FLIB@

THE PROGRAM "PREDICT" MUST BE EXECUTED BEFORE THIS PROGRAM
CAN BE USED; THE OUTPUT OF THAT PROGRAM IS USED AS THE
INPUT FOR THIS PROGRAM. IT IS RECOMMENDED THAT A MACRO
FILE BE USED TO RUN THIS PROGRAM. A SUGGESTED FORMAT IS:

VOCODE DUMMY/X DUMMY/Y CHANNELFILE/I OUTPUTSPEECH/O

THE FILE DUMMY IS THE NAME OF A DUMMY FILE. IT IS NOT USED
SO IT DOES NOT HAVE TO EXIST. THE FILE CHANNELFILE MUST BE
THE SAME AS IS USED FOR THE PREDICT PROGRAM. THE NAME
OUTPUTSPEECH IS FOR THE FILE WHICH CONTAINS THE VOCODED
SPEECH.

THE OUTPUT FILE OF THIS PROGRAM CONTAINS A DIGITAL
REPRESENTATION OF THE OUTPUT SPEECH. TO LISTEN TO THE SPEECH,
THE PROGRAM "AUDIOHIST" MUST BE USED. THE OUTPUT FILE MUST
BE MOVED TO A DIRECTORY CONTAINING THIS PROGRAM (E.G.
DP0:SPOUT). TO LISTEN TO THE OUTPUT SPEECH, GET INTO THE
DIRECTORY WHICH NOW CONTAINS THE OUTPUT SPEECH FILE AND RUN
THE PROGRAM "AUDIOHIST." TO THE FIRST PROMPT TYPE THE NAME
OF THE OUTPUT SPEECH FILE, TO THE SECOND PROMPT TYPE "1",
AND TO THE THIRD PROMPT TYPE "2."


SUBROUTINES REQUIRED:

NAME:      LOCATION:    PURPOSE:

IOF        DP4:KOWN     READS RUN MACRO FILE
UNVOCD     "            PRODUCES NORMAL RANDOM NOISE
                        WHICH DRIVES THE OUTPUT FILTER
                        FOR UNVOICED SPEECH
DRAND      "            PRODUCES UNIFORMLY DISTRIBUTED
                        NOISE WHICH IS REQUIRED BY
                        "UNVOCD"
VOICED#                 PRODUCES A GLOTTAL PULSE FOR
                        VOICED SPEECH
                        ACTUAL FILE NAMES ARE:
  GLOT1    "            GLOTTAL PULSE SHAPE: POLYNOM.
  GLOT2    "            GLOTTAL PULSE SHAPE: TRIGONOM.
  GLOT3    "            GLOTTAL PULSE SHAPE: IMPULSE
THROAT     "            THE OUTPUT FILTER

NOTE: THE INPUT FILE TO THIS PROGRAM SHOULD BE IN A FORM COMPATIBLE
WITH THE CODER PROGRAM "PREDICT" WRITTEN BY C. MCKOWN. ANY
OTHER FORM WILL GIVE SPURIOUS RESULTS.

TYPE RUN1.MC

PREDICT DP5:FIVE/C DP5:FIVE/P DECVARR/I SYNLPC/O
VOCODE S4/X LOUT/Y SYNLPC/I WORD:YOl/O
R

RUN1

PROGRAM PREDICT RUNNING.
10 POLES
25 FRAMES PROCESSED
50 FRAMES PROCESSED
75 FRAMES PROCESSED
100 FRAMES PROCESSED
125 FRAMES PROCESSED
150 FRAMES PROCESSED
175 FRAMES PROCESSED
200 FRAMES PROCESSED
225 FRAMES PROCESSED
250 FRAMES PROCESSED
275 FRAMES PROCESSED
300 FRAMES PROCESSED
325 FRAMES PROCESSED

NPTS = 22222 MSET = 276

STOP

PROGRAM VOCODE RUNNING.

NDONE = 336

SPEECH VOCODED

THE MAX VALUE FOUND WAS 2531

STOP
R

APPENDIX B


Program Listings

FILENAME: PREDICT.FR   DATE: 12:2:83   TIME: 13:43:7   PAGE


C*************************************************************************
C
C     PROGRAM:   PREDICT
C     AUTHOR:    CRAIG MCKOWN
C     DATE:      SEP 83
C     LANGUAGE:  FORTRAN5
C
C     FUNCTION:  THIS PROGRAM EMULATES THE INPUT TO A LINEAR
C                PREDICTION ENCODER. IT DETERMINES THE PITCH
C                PERIOD OF THE INCOMING SPEECH AND THE PARAMETERS
C     OF THE OUTPUT FILTER REQUIRED TO GENERATE THE SPEECH AT THE
C     VOCODER. THESE PARAMETERS ARE WRITTEN TO A FILE (PARAM) WHICH
C     ACTS AS THE TRANSMISSION CHANNEL.
C
C     THIS PROGRAM HAS BEEN WRITTEN SO THAT MANY OF THE DECISION
C     VARIABLES CAN BE SET BY THE USER. THIS ALLOWS THE USER TO VARY
C     THE METHODS OF PREDICTION OR OTHER PARAMETERS WHICH MAY AFFECT
C     THE QUALITY OF THE VOCODED SPEECH.
C
C     LOAD LINE: RLDR PREDICT IOF ENER SIFTB DIRECT DCOVAR DAUTO
C                AUTOC @FLIB@
C
C*************************************************************************
C
C     VARIABLES, PARAMETERS, AND ARGUMENTS
C
C     MAXPT:   SAMPLES BETWEEN PITCH DETECTION
C     MAXFR:   SAMPLES IN ANALYSIS WINDOW (PITCH & COEFS)
C     NFRAME:  KEEPS TRACK OF THE LPC FRAME NUMBER
C     NSET:    KEEPS TRACK OF THE PITCH FRAME NUMBER
C     K, S, KS, JS, JK : COUNTERS
C     NPTS:    NUMBER OF POINTS ANALYZED
C     SPEEFL1: NAME OF SPEECH FILE (FOR LPC)
C     SPEEFL2: NAME OF SPEECH FILE (FOR PITCH)
C     PARAM:   FILE TO WHICH LPC DATA IS WRITTEN
C              (ACTS AS THE TRANSMISSION CHANNEL)
C     DUMMY:   HOLDS THE DECISION VARIABLES
C     POLES:   NUMBER OF POLES IN THE LPC ANALYSIS
C     VAL:     DUMMY ARRAY TO HOLD THE SAMPLE SPEECH FROM THE RDBLK
C              BEFORE IT IS WRITTEN TO SPCH & SPEE
C     SPCH:    ARRAY TO HOLD DATA NEEDED FOR PITCH DETECTION
C     SPEE:    ARRAY TO HOLD DATA NEEDED FOR LPC COEFFICIENT PREDICTION
C     JUMP:    FLAG DENOTING THE MOST PREVIOUS TYPE OF SPEECH (VOICED,
C              UNVOICED, OR SILENCE)
C     PI:      NUMBER OF SAMPLES TO START OF NEXT LPC ANALYSIS WINDOW
C     AR:      LPC COEFFICIENTS: AR(1)=A0, AR(2)=A1, ..., AR(N)=AP
C              ONLY AR(2) THROUGH AR(NPOLES) ARE SENT THROUGH THE CHANNEL
C     PIT:     PITCH (DELAYED BY TWO PITCH DETECTION FRAMES)
C     UNGA:    UNVOICED GAIN FACTOR
C     THRESH:  SILENCE THRESHOLD
C     SCAF:    SCALE FACTOR
C     STHR:    VOICED/UNVOCD THRESHOLD FOR THE PITCH DETECTOR.
C     NGLT:    DECISION VARIABLE IDENTIFYING THE GLOTTAL PULSE SHAPE.
C     MP:      DECISION VARIABLE IDENTIFYING THE METHOD OF PREDICTION.
C     HI:      DECISION VARIABLE FOR PRESENCE OF HAMMING WINDOW.
C     MPCH:    DECISION VARIABLE IDENTIFYING METHOD OF PITCH DETECTION.
C
C*************************************************************************


      INTEGER MAIN(7),SPEEFL1(7),PARAM(7),DUMMY(7),SPEEFL2(7),HI
      INTEGER VAL(1280),POLES,DELAY,S,F,PI,INTVAR(10),PIT,VOCD
      INTEGER NAL(1280)
      DIMENSION SPCH(400),PBUF(100),PITCH(3),AR(20),RCOF(20)
      DIMENSION SPEE(400),RELVAR(10),AENRG(3)
      DATA NPTS,NFRAME,NSET,K,JK,JEND/0,0,0,0,0,0/
      DATA S,KS,JUMP,JS/1,1,1,1/
      DATA PITCH/3*0.0/,AENRG/3*0.0/
      DATA YMEMN/0.0/
      MAXPT = 160                      ; DEFAULT VALUES
      MAXFR = 320                      ;
      NFILES = 4
C***  CALL IOF AND OPEN ALL REQUIRED FILES.
      CALL IOF(NFILES,MAIN,SPEEFL1,SPEEFL2,DUMMY,PARAM,MS,S1,S2,S3,S4)
      CALL OPEN(1,SPEEFL1,1,IER)       ; SPEECH FILE (LPC)
      CALL OPEN(4,SPEEFL2,1,JER)       ; SPEECH FILE (PITCH)
      IF((IER.NE.1).OR.(JER.NE.1)) TYPE "OPEN FILE ERROR ",IER,JER
      CALL OPEN(2,DUMMY,3,JER)         ; DECISION VARIABLES
      IF(JER.NE.1) TYPE "OPEN FILE ERROR ",JER
      CALL DFILW(PARAM,JER)            ; LPC PARAMETERS
      IF (JER.EQ.13) GO TO 40
      IF (JER.NE.1) TYPE "DELETE FILE ERROR ",JER
40    CALL CFILW(PARAM,2,JER)
      IF (JER.NE.1) TYPE "CREATE FILE ERROR ",JER
      CALL OPEN(3,PARAM,3,JER)
      IF(JER.NE.1) TYPE "OPEN FILE ERROR ",JER
C***  GET DECISION VARIABLES
      READ (2,42) (RELVAR(I),I=1,10)
      READ (2,43) (INTVAR(I),I=1,15)
42    FORMAT(3X,F12.5)
43    FORMAT(3X,I10)

      IF(INTVAR(1).EQ.1) GO TO 45
      ACCEPT "NUMBER OF POLES IN THE LPC FILTER: ",POLES
      ACCEPT "METHOD OF PREDICTION: 0-AUTOCORR, 1-COVARIANCE",MP
      ACCEPT "METHOD OF PITCH DETECTION: 0-SIFT, 1-AUTOC",MPCH
      ACCEPT "THRESHOLD (SILENCE/SPEECH): ",THRESH
      ACCEPT "PRE/DE-EMPHASIZE? (YES-1,NO-0): ",NEMP
      ACCEPT "UNVOICED GAIN FACTOR (UNGA): ",UNGA
      ACCEPT "SCALE FACTOR (SCAF): ",SCAF
      ACCEPT "VOICED/UNVOCD THRESHOLD: ",STHR
      ACCEPT "GLOTTAL PULSE SHAPE(1-POLY,3-IMPULSE): ",NGLT
      ACCEPT "HAMMING WINDOW? (1-Y,0-N): ",HI
      ACCEPT "PITCH AND COEFFICIENT FILES THE SAME?(0-NO,1-YES)",NPCS
      GO TO 46
45    POLES = INTVAR(2)
      MP = INTVAR(3)
      MAXFR = INTVAR(4)
      MAXPT = INTVAR(5)
      NEMP = INTVAR(6)
      NGLT = INTVAR(7)
      HI = INTVAR(8)
      MPCH = INTVAR(9)
      NPCS = INTVAR(10)
      STHR = RELVAR(1)
      SCAF = RELVAR(2)
      THRESH = RELVAR(3)
      UNGA = RELVAR(4)
46    TYPE POLES," POLES "
      NPOLES = POLES + 1
      MAXFR1 = MAXFR - 1               ; FOR HAMMING WINDOW
      WRITE BINARY(3) POLES,NEMP,UNGA,NGLT   ; CHANNEL WRITE

C*********************************************************************
C*********************************************************************
C***  THE OPERATIONAL PROGRAM...
C***  START OF LOOP
50    CONTINUE
C***  READ IN A NEW BLOCK OF DATA
      CALL RDBLK(1,K,VAL,5,IER)        ; READ THE SPEECH INTO AN ARRAY
      IF (IER.EQ.9) GO TO 260
      IF (IER.NE.1) TYPE "READ FILE ERROR ",IER
C***  START A NEW FRAME OF PITCH DETECTION
52    NSET = NSET + 1
      NPOINT = 0
      F = S + MAXFR - 1
      IF(NPCS.EQ.0) GO TO 60
      DO 55 J=S,F
      NPOINT=NPOINT+1
      SPCH(NPOINT) = FLOAT(VAL(J))/SCAF   ; SPCH FOR SIFT & ENER
55    CONTINUE
      GO TO 61
C***  USED IF PITCH AND COEFFICIENT FILES ARE DIFFERENT
60    CALL RDBLK(4,K,NAL,5,IER)        ; READ THE SPEECH INTO AN ARRAY
      IF (IER.EQ.9) GO TO 260
      IF (IER.NE.1) TYPE "READ FILE ERROR ",IER
      DO 61 J=S,F
      NPOINT=NPOINT+1
      SPCH(NPOINT) = FLOAT(NAL(J))/SCAF   ; SPCH FOR SIFT & ENER
61    CONTINUE
C***  CALCULATE ENERGY IN A FRAME
62    CALL ENER(SPCH,THRESH,NEN,AENRG,MAXFR)
      IF (NEN.EQ.0) GO TO 65           ; NO NEED TO GET PITCH
C***  CALL TO THE SUBROUTINES WHICH PERFORM PITCH ANALYSIS
63    IF(MPCH.EQ.0) CALL SIFTA(SPCH,PITCH,STHR,MAXFR)
      IF(MPCH.EQ.1) CALL AUTOC(SPCH,PITCH,STHR,MAXFR)
65    DELAY = NSET - 2
      IF (DELAY.LE.0) GO TO 225        ; TRUE PITCH IS DELAYED


C***  GET PREDICTION COEFFICIENTS
70    CONTINUE
      NFRAME = NFRAME + 1
      IF((MOD(NFRAME,25)).EQ.0) TYPE NFRAME," FRAMES PROCESSED"
      IF(AENRG(3).LT.THRESH) GO TO 204 ; NO NEED TO FIND COEFFICIENTS
      DO 75 I = 1,20                   ;
      RCOF(I) = 0.0                    ; INITIALIZE ARRAYS
      AR(I) = 0.0                      ; FOR PREDICTION
75    CONTINUE                         ; COEFFICIENTS
      AL = 0.0                         ;
      NPOINT = 0
      F = KS + MAXFR - 1               ; A COUNTER
C***  LOAD ARRAY AND PREEMPHASIZE FOR COEFFICIENT GENERATION
      IF(HI.EQ.0) GO TO 179            ; NO NEED FOR HAMMING WINDOW
      IF(NEMP.EQ.0) GO TO 81           ; NO PRE-EMPHASIS
      DO 80 J = KS,F
      YMEMD = FLOAT(VAL(J))*(.54-.46*COS(NPOINT*6.28318/MAXFR1))
      SP1 = YMEMD - .9*YMEMN
      NPOINT = NPOINT + 1
      SPEE(NPOINT) = SP1/SCAF
      YMEMN = YMEMD
80    CONTINUE
      GO TO 82
C***  NO PRE-EMPHASIS
81    DO 82 J = KS,F
      SP1 = FLOAT(VAL(J))*(.54-.46*COS(NPOINT*6.28318/MAXFR1))
      NPOINT = NPOINT + 1
      SPEE(NPOINT) = SP1/SCAF
82    CONTINUE
      GO TO 189
C***  NO HAMMING WINDOW
179   IF(NEMP.EQ.0) GO TO 181          ; NO PRE-EMPHASIS
      DO 180 J = KS,F
      YMEMD = FLOAT(VAL(J))
      SP1 = YMEMD - .9*YMEMN
      NPOINT = NPOINT + 1
      SPEE(NPOINT) = SP1/SCAF
      YMEMN = YMEMD
180   CONTINUE
      GO TO 189
C***  NO PRE-EMPHASIS
181   DO 182 J = KS,F
      NPOINT = NPOINT + 1
      SPEE(NPOINT) = FLOAT(VAL(J))/SCAF
182   CONTINUE
189   CONTINUE


C*** CALL TO SUBROUTINE TO DETERMINE THE FILTER COEFFICIENTS
  190 IF (MP.EQ.0) CALL AUTO(MAXFR,SPEE,POLES,AR,AL,RCOF)
      IF (MP.EQ.1) CALL COVAR(MAXFR,SPEE,POLES,AR,AL,RCOF)

C*** CALCULATE VALUES TO BE WRITTEN TO THE CHANNEL
      IF (PITCH(3).EQ.0.0) GO TO 200       ; UNVOICED SPEECH

C*** VOICED SPEECH
      PIT = INT(PITCH(3))
      VOCD = 1
      P1 = 2*PIT
      IF(JUMP.NE.0) P1 = P1/2              ; IF PREVIOUS SET NOT VOICED,
      IF((P1.GT.MAXPT).AND.(JUMP.EQ.0)) P1 = P1/2
      JS = JS + P1                         ; MORE FREQUENT ANALYSIS
      KS = KS + P1
      JUMP = 0
      GO TO 210
C*** UNVOICED SPEECH
  200 PIT = 0
      VOCD = 0
      P1 = MAXPT
      IF (JUMP.NE.1) P1 = P1/2             ; IF PREVIOUS SET NOT UNVOICED,
      JS = JS + P1                         ; MORE FREQUENT ANALYSIS
      KS = KS + P1
      JUMP = 1
      GO TO 210

C*** SILENCE
  204 PIT = 0
      AL = 0.0
      VOCD = 2
      P1 = MAXPT
      JS = JS + P1
      KS = KS + P1
      JUMP = 2
      DO 209 I = 2,20
        AR(I) = 0.0
  209 CONTINUE

C*** WRITE COEFFICIENTS TO CHANNEL FILE
  210 CONTINUE
      NPTS = NPTS + P1
      WRITE BINARY(3) VOCD,P1,PIT,AL             ; CHANNEL WRITE
      WRITE BINARY(3) (AR(J),J=2,NPOLES)         ; CHANNEL WRITE
X211  TYPE VOCD,P1,PIT,AL,AENRG(3)
X     WRITE(12,212) AL
X212  FORMAT(1X,F12.3)
X     WRITE(12,213) (AR(J),J=1,NPOLES)
X213  FORMAT(9(1X,F12.6))
X     ACCEPT "CONTINUE?(1-YES,0-NO): ",ICK
X     IF (ICK.EQ.0) GO TO 290

C*** BOOKKEEPING ROUTINE
  220 IF(JS.LE.MAXPT) GO TO 70             ; GET PREDICTION COEFFICIENTS
      JS = JS - MAXPT
  225 CONTINUE
      S = S + MAXPT
      IF (S.LT.768) GO TO 50               ; GO TO START OF LOOP
      S = S - 256
      KS = KS - 256
      K = K+1
      IF (JEND.GT.1) GO TO 270             ; AFTER DELAY OF TWO, FINALLY EXIT
      IF (JEND.EQ.1) GO TO 70
      GO TO 50                             ; GO TO START OF LOOP

  260 CONTINUE
      JEND = JEND + 1
      GO TO 70                   ; CALCULATE PREDICTOR COEFFICIENTS AGAIN

C********************************************************************
C********************************************************************

C*** EXIT PROCESS
  270 MSET = NSET                          ; TOTAL # OF SETS
      ICK = 1                              ; AN END-OF-FILE INDICATOR
      WRITE BINARY(3) ICK                  ; CHANNEL WRITE

C*** CLOSE THE FILES
      CALL CLOSE(1,IER)
      CALL CLOSE(4,JER)
      IF ((IER.NE.1).OR.(JER.NE.1)) TYPE " CLOSE FILE ERROR ",IER,JER
      CALL CLOSE(2,IER)
      CALL CLOSE(3,JER)
      IF ((IER.NE.1).OR.(JER.NE.1)) TYPE " CLOSE FILE ERROR ",IER,JER
      TYPE " NPTS = ",NPTS," MSET = ",MSET
      STOP
      END
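
The frame-loading branches of the coder (labels 80 through 182) differ only in whether a Hamming window and a first-order pre-emphasis are applied before the samples are scaled into SPEE. As a reading aid, the following is a minimal Python sketch of that conditioning step. It is not the thesis's FORTRAN 5 code: the function name `condition_frame` and its arguments are hypothetical, and it assumes that MAXFR1 in the listing equals the frame length minus one.

```python
import math

def condition_frame(samples, prev=0.0, window=True, preemph=True, scale=1.0):
    """Return (conditioned_frame, new_memory) for one analysis frame.

    Mirrors the four branches of the listing: Hamming window on or off,
    pre-emphasis y[n] = x[n] - 0.9*x[n-1] on or off, then division by a
    scale factor (SCAF in the listing). `prev` plays the role of YMEMN,
    the pre-emphasis memory carried from one frame to the next.
    """
    n = len(samples)
    denom = max(n - 1, 1)  # assumption: MAXFR1 = MAXFR - 1
    out = []
    for i, v in enumerate(samples):
        x = float(v)
        if window:
            # Hamming weight 0.54 - 0.46*cos(2*pi*i/(N-1))
            x *= 0.54 - 0.46 * math.cos(2.0 * math.pi * i / denom)
        if preemph:
            y = x - 0.9 * prev
            prev = x  # the memory holds the (windowed) input, as in the listing
        else:
            y = x
        out.append(y / scale)
    return out, prev
```

Passing the returned memory back in as `prev` on the next call reproduces the frame-to-frame carry of YMEMN.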


FILENAME: IOF.FR    DATE: 12: 2:83    TIME: 13:44:52    PAGE

      SUBROUTINE IOF(N,MAIN,F1,F2,F3,F4,MS,S1,S2,S3,S4)
C*********************************************************************
C  ADAPTED FROM SUBROUTINE WRITTEN BY LT. SIMMONS 10 SEPT 91
C
C  THIS FORTRAN 5 SUBROUTINE WILL READ FROM THE FILE COM.CM
C  (FCOM.CM IN THE FOREGROUND) THE PROGRAM NAME, ANY GLOBAL
C  SWITCHES, AND UP TO FOUR LOCAL FILE NAMES AND CORRESPONDING
C  LOCAL SWITCHES.
C
C  ARGUMENTS:
C
C    N IS THE NUMBER OF LOCAL FILES AND SWITCHES TO BE READ FROM
C    (F)COM.CM.  N MUST BE 1, 2, 3, OR 4.
C
C    MAIN IS AN ASCII ARRAY FOR THE MAIN PROGRAM FILE NAME.
C
C    F1, F2, F3, AND F4 ARE THE FOUR ASCII ARRAYS TO RETURN THE
C    LOCAL FILE NAMES.
C
C    MS IS A TWO-WORD INTEGER ARRAY THAT HOLDS ANY GLOBAL SWITCHES.
C
C    S1, S2, S3, AND S4 ARE TWO-WORD INTEGER ARRAYS THAT HOLD THE
C    LOCAL SWITCHES CORRESPONDING TO F1 THROUGH F4 RESPECTIVELY.
C
C*********************************************************************

      DIMENSION MAIN(7),MS(2)
      INTEGER F1(7),F2(7),F3(7),F4(7),S1(2),S2(2),S3(2),S4(2)

C  CHECK BOUNDS ON N
      IF((N.LT.1).OR.(N.GT.4)) STOP        ; N OUT OF BOUNDS

C  PROCESS THE DATA IN (F)COM.CM
      CALL GROUND(I)               ; FIND OUT WHICH GROUND PROGRAM IS IN
      IF(I.EQ.0) OPEN 0,"COM.CM"           ; OPEN CH. 0 TO COM.CM
      IF(I.EQ.1) OPEN 0,"FCOM.CM"          ; OPEN CH. 0 TO FCOM.CM
      CALL COMARG(0,MAIN,MS,IER)           ; READ FROM (F)COM.CM
      IF(IER.NE.1) TYPE " COMARG ERROR: ",IER
      WRITE(10,1) MAIN(1)                  ; TYPE PROGRAM NAME
    1 FORMAT(' PROGRAM ',S13,'RUNNING. ')
      CALL COMARG(0,F1,S1,JER)             ; READ FROM (F)COM.CM
      IF(JER.NE.1) TYPE " COMARG ERROR (F1): ",JER
      IF(N.EQ.1) GO TO 2                   ; TEST N
      CALL COMARG(0,F2,S2,KER)             ; READ FROM (F)COM.CM
      IF(KER.NE.1) TYPE " COMARG ERROR (F2): ",KER
      IF(N.EQ.2) GO TO 2                   ; TEST N
      CALL COMARG(0,F3,S3,LER)             ; READ FROM (F)COM.CM
      IF(LER.NE.1) TYPE " COMARG ERROR (F3): ",LER
      IF(N.EQ.3) GO TO 2                   ; TEST N
      CALL COMARG(0,F4,S4,LER)             ; READ FROM (F)COM.CM
      IF(LER.NE.1) TYPE " COMARG ERROR (F4): ",LER
    2 CLOSE 0
      RETURN
      END
FILENAME: ENER.FR    DATE: 12: 2:83    TIME: 13:42:24    PAGE

C*****************************************************************
C
C  THIS SUBROUTINE DETERMINES WHETHER A FRAME'S ENERGY
C  EXCEEDS A SILENCE THRESHOLD (NEN=0: SILENCE; NEN=1: SPEECH).
C
C  AENRG IS A THREE-MEMBER ARRAY WHICH HOLDS A MEMORY OF THE
C  PREVIOUS VALUES OF THE COMPUTED ENERGY.
C
C*****************************************************************
      SUBROUTINE ENER(SPCH,THRESH,NEN,AENRG,MAXFR)
      DIMENSION SPCH(1),AENRG(1)
      NEN = 1                              ; PRESET DECISION TO SPEECH
      SUM = 0.0                            ; INITIALIZE SUM
      DO 100 J=1,MAXFR
        S1 = SPCH(J)
        SUM = SUM + S1 * S1                ; ENERGY = SUM OF SQUARES
  100 CONTINUE
      AENRG(3) = AENRG(2)
      AENRG(2) = AENRG(1)
      AENRG(1) = SUM
      IF (SUM.LT.THRESH) NEN = 0
      RETURN
      END
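
For readers who do not have a FORTRAN 5 system at hand, the silence test in ENER can be restated in a few lines of Python. This is only an illustrative sketch: the function name `frame_energy` is hypothetical, and the three-element list `history` stands in for the AENRG array.

```python
def frame_energy(frame, thresh, history):
    """Return 1 (speech) or 0 (silence) and update the energy history.

    `history` is a 3-element list: history[0] is the newest frame energy,
    history[2] the oldest, matching the shift performed on AENRG.
    """
    energy = sum(float(s) * float(s) for s in frame)  # sum of squares
    # Shift the memory by one frame, then store the new energy first.
    history[2], history[1], history[0] = history[1], history[0], energy
    return 0 if energy < thresh else 1
```

Because the coder tests AENRG(3), the oldest entry, its silence decision lags the newest frame by two calls.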


FILENAME: SIFTB.FR    DATE: 12: 2:83    TIME: 13:43:30    PAGE

C*********************************************************************
C
C  SIFT ALGORITHM PROCESSING - STEP 1
C
C  INPUT PARAMETERS:  SPCH(J) (J=1,2,...,MAXFR)
C                     THE SPEECH SIGNAL TO BE PROCESSED FOR PITCH
C
C  OUTPUT PARAMETER:  PITCH(J) (J=1,2,3)
C                     (UNITS IN SAMPLES)
C
C  NOTE: PARAMETERS FIXED FOR FS=8 KHZ
C
C*********************************************************************
      SUBROUTINE SIFTA(SPCH,PITCH,STHR,MAXFR)
      DIMENSION SPCH(1),PBUF(100),AF(4),PF(4),DF(5),D(5),ABUF(33)
      DIMENSION U(100),A(5),P(5),RC(5),PITCH(1)
      DATA AF/1.,-2.340366,2.011900,-.614109/
      DATA PF/.0357082,-.0069956,-.0069956,.0357082/
      DATA P/1.,4*0./
      MAX4 = INT(MAXFR/4)                  ; MAXFR = 320, MAX4 = 80
      MAX80 = MAX4
      AX4 = FLOAT(MAX4) - 1.               ; AX4 = 79.
      AX5 = AX4 - 4.                       ; AX5 = 75.
      MAX6 = MAX4 - 4

C*** INITIALIZE MEMORY OF DIRECT TO ZERO
      DO 10 J=1,5
        DF(J)=0.0
        D(J)=0.0
   10 CONTINUE

C*** PRE-FILTER, DOWN-SAMPLER, DIFFERENCER AND HAMMING WINDOWER.
      UPREV=0.0
      DO 20 J=1,MAXFR
        CALL DIRECT(AF,PF,3,DF,SPCH(J),SOUT)
        IF (MOD(J,4).NE.0) GO TO 20
        K=J/4
        PBUF(K)=SOUT
        U(K) = (SOUT-UPREV)*(.54-.46*COS((K-1.)*6.28318/AX4))
        UPREV=SOUT
   20 CONTINUE

C*** COMPUTE INVERSE FILTER COEFFICIENTS
      CALL AUTO(MAX4,U,4,A,ALP,RC)

C*** PERFORM INVERSE FILTERING AND HAMMING WINDOW
      DO 30 J=1,MAX80
        CALL DIRECT(P,A,4,D,PBUF(J),FOUT)
        IF (J.LE.4) GO TO 30
        PBUF(J-4)=FOUT*(.54-.46*COS((J-5)*6.28318/AX5))
   30 CONTINUE

C*** PERFORM AUTOCORRELATION ON PITCH BUFFER
      DO 25 JJ=1,33
        J=JJ-1
        NMJ=MAX6 - J
        SUM=0.
        DO 15 I=1,NMJ
          IPJ=I+J
          SUM=SUM+PBUF(I)*PBUF(IPJ)
   15   CONTINUE
        ABUF(JJ)=SUM
   25 CONTINUE

C*** OBTAIN PITCH VALUES FROM LAST THREE FRAMES
      P1=PITCH(1)/4. + 1.
      P2=PITCH(2)/4. + 1.
      P3=PITCH(3)/4. + 1.
      IF(PITCH(1).EQ.0.0) P1 = 0.0
      IF(PITCH(2).EQ.0.0) P2 = 0.0
      IF(PITCH(3).EQ.0.0) P3 = 0.0

C*** GET PEAK WITHIN RANGE (6,32)
      L=6
      AMAX=ABUF(L)
      DO 35 J=6,32
        IF(ABUF(J).LE.AMAX) GO TO 35
        AMAX=ABUF(J)
        L=J
   35 CONTINUE

C*** TEST FOR MAX EQUAL ZERO
      IF (AMAX.EQ.0.) GO TO 60

C*** TEST FOR LEFT HAND EDGE.  IF ABUF(L) IS NOT A PEAK SET UNVOICED
      IF (ABUF(L).LT.ABUF(L-1)) GO TO 60

C*** PERFORM PARABOLIC INTERPOLATION ABOUT LOCATION L
      AA=ABUF(L-1)-ABUF(L)
      AA=(AA+ABUF(L+1)-ABUF(L))/2.
      BB=(ABUF(L+1)-ABUF(L-1))/4.
      AP=ABUF(L)-BB*BB/AA
      AL=L-BB/AA
      V=AP/ABUF(1)

C*** TEST WITH VARIABLE THRESHOLD
      IF (L.GE.19) GO TO 40
      DD =-1.*(L-6.)/13.+2.
      GO TO 50
   40 CONTINUE
      DD =-1.*(L-19.)/13.+1
   50 CONTINUE
      V=V/DD

C*** DECISIONS
      IF(V.GE.STHR) GO TO 70
      IF(P1.EQ.0.) GO TO 60
      STHQ = .9*STHR
      IF(V.GE.STHQ) GO TO 70
   60 P0=0.
      GO TO 80
   70 P0=AL
   80 IF(ABS(P1-P3).LE..375*P3) P2=(P1+P3)/2.

C*** IF (P0 AND P1 ARE CLOSE) AND (P2 NOT 0) BUT P3 = 0, THEN
C*** USE LINEAR EXTRAPOLATION FOR P2 (COMING OUT OF VOICED).
      IF (P3.NE.0.) GO TO 90
      IF (P2.EQ.0.) GO TO 90
      IF (ABS(P0-P1).GT.0.2*P1) GO TO 90
      P2=(2.*P1)-P0

C*** TEST FOR ISOLATED "VOICED" AND INCORRECT END OF "VOICED"
   90 IF (P1.NE.0.) GO TO 100
      IF (ABS(P2-P3).GT.(.375*P3)) P2=0.

C*** UPDATE FRAMES
  100 PITCH(3)=(P2 - 1.)*4.
      PITCH(2)=(P1 - 1.)*4.
      PITCH(1)=(P0 - 1.)*4.
      IF(P2.EQ.0.0) PITCH(3) = 0.0
      IF(P1.EQ.0.0) PITCH(2) = 0.0
      IF(P0.EQ.0.0) PITCH(1) = 0.0
      RETURN
      END
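
The parabolic interpolation step in SIFTA (the AA and BB assignments) refines the integer peak location L by fitting a parabola through the peak and its two neighbours. A minimal Python sketch of the same arithmetic follows; the function name `parabolic_peak` and the zero-curvature guard are additions for illustration, not part of the listing.

```python
def parabolic_peak(y_prev, y_peak, y_next, index):
    """Refine a sampled peak at `index` using its two neighbours.

    Same quantities as the listing:
      AA = (y_prev - 2*y_peak + y_next) / 2   (curvature term)
      BB = (y_next - y_prev) / 4              (half the slope term)
    Returns (interpolated_location, interpolated_peak_value).
    """
    aa = (y_prev - 2.0 * y_peak + y_next) / 2.0
    bb = (y_next - y_prev) / 4.0
    if aa == 0.0:
        # Flat neighbourhood: no refinement possible (guard not in listing).
        return float(index), y_peak
    return index - bb / aa, y_peak - bb * bb / aa
```

For samples of y = 10 - (x - 6.3)**2 taken at x = 5, 6, 7, the sketch recovers the true peak location 6.3 and height 10.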
FILENAME: AUTOC.FR    DATE: 12: 2:83    TIME: 13:43:46    PAGE 1

C**************************************************************
C
C  THIS SUBROUTINE CALCULATES THE PITCH PERIOD
C
C  INPUT PARAMETERS:   SPCH(J)  J=1,2,...,MAXFR
C                      THE SPEECH SIGNAL TO BE PROCESSED FOR PITCH
C
C  OUTPUT PARAMETERS:  PITCH(J)  J=1,2,3
C                      THE PITCH IN NUMBER OF SAMPLES
C
C  NOTE: PARAMETERS SET FOR FS = 8KHZ
C
C**************************************************************
      SUBROUTINE AUTOC(SPCH,PITCH,STHR,MAXFR)
      DIMENSION SPCH(1),AF(4),PF(4),DF(5),ABUF(151),PBUF(400)
      DIMENSION PITCH(1)
      INTEGER MXFTH,MNFTH,MXLTH,MNLTH
      DATA PF/.0357082,-.0069956,-.0069956,.0357082/
      DATA AF/1.,-2.340366,2.011900,-.614109/
      AXFR = MAXFR - 1

C*** INITIALIZE MEMORY OF DIRECT TO ZERO
      DO 10 I = 1,5
        DF(I) = 0.0
   10 CONTINUE

C*** PREFILTER AND FIND PEAKS IN FIRST & LAST THIRD OF FRAME
C*** MINIMUM OF THESE IS CLIPPING THRESHOLD
      NFIRTH = INT(MAXFR/3)
      NLASTH = INT(MAXFR*2/3)
      MXFTH = 0.0                          ; SET COMPARATORS TO ZERO
      MXLTH = 0.0
      DO 20 I = 1,MAXFR
        CALL DIRECT(AF,PF,3,DF,SPCH(I),SOUT)
        PBUF(I) = SOUT*(.54-.46*COS((I-1.)*6.28318/AXFR))
X       PBUF(I) = SOUT
        IF ((I.LE.NFIRTH).AND.(PBUF(I).GT.MXFTH)) MXFTH = PBUF(I)
        IF ((I.GE.NLASTH).AND.(PBUF(I).GT.MXLTH)) MXLTH = PBUF(I)
   20 CONTINUE
      IF(MXFTH.LE.MXLTH) MXLTH = MXFTH     ; MIN PEAK IS MXLTH
      MXFTH = .75*MXFTH
      MNFTH = .50*MXFTH
      MXLTH = -(MXFTH)
      MNLTH = -(MNFTH)

C*** CLIP SPEECH
      DO 40 I = 1,MAXFR
        IF(PBUF(I).LT.MXFTH) GO TO 25
        PBUF(I) = 1.0
        GO TO 40
   25   IF(PBUF(I).LT.MNFTH) GO TO 26
        PBUF(I) = .5
        GO TO 40
   26   IF(PBUF(I).LT.MXLTH) GO TO 30
        IF(PBUF(I).LT.MNLTH) GO TO 29
        PBUF(I) = 0.0
        GO TO 40
   29   PBUF(I) = -.5
        GO TO 40
   30   PBUF(I) = -1.0
   40 CONTINUE

C*** COMPUTE AUTOCORRELATIONS (LAGS 0 THROUGH 150)
      DO 60 JJ = 1,151
        J = JJ-1
        NMJ = MAXFR - J
        SUM = 0.0
        DO 50 I = 1,NMJ
          IPJ = I + J
          SUM = SUM + PBUF(I)*PBUF(IPJ)
   50   CONTINUE
        ABUF(JJ) = SUM
   60 CONTINUE

C*** OBTAIN PITCH VALUES FROM LAST THREE FRAMES
      P1 = PITCH(1)*2.5
      P2 = PITCH(2)*2.5
      P3 = PITCH(3)*2.5
      L = 16
      AMAX = ABUF(L)
      DO 70 J = 16,150
        IF(ABUF(J).LE.AMAX) GO TO 70
        AMAX = ABUF(J)
        L = J
   70 CONTINUE
      IF(AMAX.EQ.0.0) GO TO 100            ; TEST FOR MAX EQUAL ZERO
      IF(ABUF(L).LT.ABUF(L-1)) GO TO 100   ; TEST FOR L.H. EDGE
      V = ABUF(L)/ABUF(1)
      AL = L

C*** TEST V WITH THE THRESHOLD
      IF(V.GE.STHR) GO TO 110
      IF(P1.EQ.0.0) GO TO 100
      STHQ = .9*STHR
      IF(V.GE.STHQ) GO TO 110
  100 P0 = 0.0
      GO TO 120
  110 P0 = AL
  120 IF(ABS(P1-P3).LE..375*P3) P2=(P1+P3)/2.
      IF(P3.NE.0.) GO TO 130
      IF(P2.EQ.0.) GO TO 130
      IF(ABS(P0-P1).GT.0.2*P1) GO TO 130
      P2=(2.*P1)-P0

C*** TEST FOR ISOLATED "VOICED" & INCORRECT END OF "VOICED"
  130 IF(P1.NE.0.0) GO TO 140
      IF(ABS(P2-P3).GT.(.375*P3)) P2 = 0.0

C*** UPDATE PITCH
  140 PITCH(3) = P2/2.5
      PITCH(2) = P1/2.5
      PITCH(1) = P0/2.5
      RETURN
      END
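
AUTOC quantizes the prefiltered frame to five levels and then searches the autocorrelation of the clipped signal for its strongest lag. The Python sketch below illustrates those two stages on their own. It is a simplified reading aid: the function names are hypothetical, the clipping thresholds are passed in directly instead of being derived from frame peaks, and the voicing decision and pitch smoothing of the listing are left out.

```python
import math

def clip_five_level(x, hi, lo):
    """Quantize x to +1, +0.5, 0, -0.5 or -1 (requires hi > lo > 0)."""
    if x >= hi:
        return 1.0
    if x >= lo:
        return 0.5
    if x < -hi:
        return -1.0
    if x < -lo:
        return -0.5
    return 0.0

def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + j] for i in range(n - j))
            for j in range(max_lag + 1)]

def strongest_lag(r, first, last):
    """Lag in [first, last] with the largest autocorrelation value."""
    best = first
    for j in range(first, last + 1):
        if r[j] > r[best]:
            best = j
    return best
```

Applied to a sinusoid with a period of 20 samples, the strongest lag found between 10 and 50 is 20, which is the behaviour the pitch search relies on.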


FILENAME: DIRECT.FR    DATE: 12: 2:83    TIME: 13:44:38

C*************************************************************
C
C  THIS ROUTINE IMPLEMENTS THE DIRECT FORM FILTER.
C
C*************************************************************
      SUBROUTINE DIRECT(A,P,M,D,XIN,XOUT)
      DIMENSION A(1),P(1),D(1)
      XOUT = 0.0
      D(1) = XIN
      DO 10 J = 1,M
        I = M + 1 - J
        XOUT = XOUT + D(I+1)*P(I+1)
        D(1) = D(1) - A(I+1)*D(I+1)
        D(I+1) = D(I)
   10 CONTINUE
      XOUT = XOUT + D(1)*P(1)
      RETURN
      END
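
DIRECT processes one input sample per call and keeps its delay line in D between calls. The same recurrence is sketched below in Python with zero-based indexing; the function name `direct_step` is hypothetical, `a` holds the denominator coefficients with `a[0] = 1`, `p` holds the numerator coefficients, and `d` must have `m + 1` elements.

```python
def direct_step(a, p, m, d, xin):
    """Filter one sample through P(z)/A(z) in direct form, updating d in place."""
    xout = 0.0
    d[0] = xin
    for k in range(m, 0, -1):
        xout += d[k] * p[k]   # feed-forward tap on the delayed state
        d[0] -= a[k] * d[k]   # feedback tap builds the new state in d[0]
        d[k] = d[k - 1]       # shift the delay line for the next call
    xout += d[0] * p[0]
    return xout
```

With `a = [1, -0.5]`, `p = [1, 0]` and a unit impulse as input, successive calls return 1, 0.5, 0.25, the impulse response of y[n] = x[n] + 0.5 y[n-1].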


FILENAME: DAUTO.FR    DATE: 12: 2:83    TIME: 13:44: 6    PAGE

C*******************************************************************
C
C  SUBROUTINE AUTO AS PRESENTED ON PAGE 219 OF MARKEL & GRAY
C
C  THE ARITHMETIC IN THIS SUBROUTINE IS PERFORMED IN DOUBLE
C  PRECISION TO REDUCE THE EFFECTS OF ILL-CONDITIONING OF THE
C  AUTOCORRELATION MATRIX.
C
C*******************************************************************

      SUBROUTINE AUTO(N,X,M,A,ALPHA,RC)
      DIMENSION X(1),A(1),RC(1)
      DOUBLE PRECISION DA(20),DRC(20),DR(21),DAL,S,AT

C*** SET THE INITIAL VALUES TO ZERO
      DO 5 I = 1,20
        DA(I) = DBLE(0.0)
        DRC(I) = DBLE(0.0)
    5 CONTINUE
      MP=M+1

C*** COMPUTE THE AUTOCORRELATION TERMS
      DO 15 K=1,MP
        DR(K)=DBLE(0.0)
        NK=N-K+1
        DO 10 NP=1,NK
          DR(K)=DR(K)+DBLE(X(NP)*X(NP+K-1))
   10   CONTINUE
   15 CONTINUE
      DO 17 I = 2,21
        DR(I) = DR(I)/DR(1)
   17 CONTINUE
      R0 = SNGL(DR(1))
      DR(1) = DBLE(1.0)
      DRC(1)=-DR(2)/DR(1)
      DA(1)=DBLE(1.0)
      DA(2)=DRC(1)
      DAL=DR(1)+DR(2)*DRC(1)
      DO 40 MINC=2,M
        S=DBLE(0.0)
        DO 20 IP=1,MINC
          S=S+DR(MINC-IP+2)*DA(IP)
   20   CONTINUE
        DRC(MINC)=-S/DAL
        MH=MINC/2+1
        DO 30 IP=2,MH
          IB=MINC-IP+2
          AT=DA(IP)+DRC(MINC)*DA(IB)
          DA(IB)=DA(IB)+DRC(MINC)*DA(IP)
          DA(IP)=AT
   30   CONTINUE
        DA(MINC+1)=DRC(MINC)
        DAL=DAL+DRC(MINC)*S
        IF (DAL) 50,50,40
   40 CONTINUE
      DO 45 I = 1,20
        A(I) = SNGL(DA(I))
        RC(I) = SNGL(DRC(I))
   45 CONTINUE
   50 ALPHA = SNGL(DAL)
      ALPHA = SQRT(ALPHA*R0)
      RETURN
      END
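
The recursion inside AUTO is the Levinson-Durbin solution of the autocorrelation normal equations. A compact Python sketch of that recursion is given below as a reading aid. It assumes the autocorrelation values have already been computed, uses the same sign convention as the listing (inverse filter A(z) = 1 + a1 z^-1 + ...), and omits the listing's normalization by R(0) and the final square root; the function name `levinson_durbin` is hypothetical.

```python
def levinson_durbin(r, m):
    """Solve for an order-m predictor from autocorrelations r[0..m].

    Returns (a, rc, alpha): inverse-filter coefficients a[0..m] with
    a[0] = 1, reflection coefficients rc[0..m-1], and the residual energy.
    """
    a = [1.0] + [0.0] * m
    rc = [0.0] * m
    alpha = float(r[0])
    for i in range(1, m + 1):
        s = sum(r[i - k] * a[k] for k in range(i))
        k_i = -s / alpha
        rc[i - 1] = k_i
        prev = a[:]  # coefficients of the order i-1 solution
        for k in range(1, i):
            a[k] = prev[k] + k_i * prev[i - k]
        a[i] = k_i
        alpha += k_i * s  # same update as DAL = DAL + DRC(MINC)*S
        if alpha <= 0.0:
            break  # numerically singular: stop early, as the listing does
    return a, rc, alpha
```

For the autocorrelation sequence 1, 0.5, 0.25 of a first-order process, the sketch returns a = [1, -0.5, 0] with residual energy 0.75, so the second reflection coefficient is zero as expected.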
FILENAME: DCOVAR.FR    DATE: 12: 2:83    TIME: 13:44:23    PAGE

C********************************************************************
C
C  SUBROUTINE COVAR AS PRESENTED ON PG 221 OF MARKEL & GRAY
C
C  THE NUMERICAL MANIPULATIONS REQUIRED IN THIS ALGORITHM ARE
C  PERFORMED IN DOUBLE PRECISION ARITHMETIC TO COMBAT POSSIBLE
C  ILL-CONDITIONING OF THE COVARIANCE MATRIX.
C
C********************************************************************
      SUBROUTINE COVAR(N,X,M,A,ALPHA,GRC)
      DIMENSION X(1),A(1),GRC(1)
      DOUBLE PRECISION B(210),BETA(20),CC(21)
      DOUBLE PRECISION DA(20),DGRC(21),DALPHA,S,GAM
      MP = M + 1

C*** SET THE INITIAL VALUES TO ZERO.
      DO 299 I=1,210
        B(I) = DBLE(0.0)
  299 CONTINUE
      DALPHA = DBLE(0.0)
      CC(1) = DBLE(0.0)
      CC(2) = DBLE(0.0)

C*** CALCULATE THE COVARIANCE TERMS
      DO 10 NP = MP,N
        NP1 = NP - 1
        DALPHA = DALPHA + DBLE(X(NP)*X(NP))
        CC(1) = CC(1) + DBLE(X(NP)*X(NP1))
        CC(2) = CC(2) + DBLE(X(NP1)*X(NP1))
   10 CONTINUE
      B(1) = DBLE(1.0)
      BETA(1) = CC(2)
      DGRC(1) = -CC(1)/CC(2)
      DA(1) = DBLE(1.0)
      DA(2) = DGRC(1)
      DALPHA = DALPHA + DGRC(1)*CC(1)
      MF = M
      DO 130 MINC = 2,MF
C*** CALCULATE THE COVARIANCE TERMS
        DO 20 J = 1,MINC
          JP = MINC + 2 - J
          N1 = MP + 1 - JP
          N2 = N + 1 - MINC
          N3 = N + 2 - JP
          CC(JP) = CC(JP-1)+DBLE(X(MP-MINC)*X(N1))
     X             -DBLE(X(N2)*X(N3))
   20   CONTINUE
        CC(1) = DBLE(0.0)
        DO 30 NP = MP,N
          CC(1) = CC(1) + DBLE(X(NP-MINC)*X(NP))
   30   CONTINUE
        MSUB = (MINC*MINC-MINC)/2
        MM1 = MINC - 1
        B(MSUB+MINC) = DBLE(1.0)
        DO 70 IP = 1,MM1
          ISUB = (IP*IP-IP)/2
          IF (BETA(IP)) 140,70,40
   40     GAM = DBLE(0.0)
          DO 50 J = 1,IP
            GAM = GAM+CC(J+1)*B(ISUB+J)
   50     CONTINUE
          GAM = GAM/BETA(IP)
          DO 60 JP = 1,IP
            B(MSUB+JP)=B(MSUB+JP)-GAM*B(ISUB+JP)
   60     CONTINUE
   70   CONTINUE
        BETA(MINC) = DBLE(0.0)
        DO 80 J = 1,MINC
          BETA(MINC) = BETA(MINC)+CC(J+1)*B(MSUB+J)
   80   CONTINUE
        IF (BETA(MINC)) 140,120,90
   90   S = DBLE(0.0)
        DO 100 IP = 1,MINC
          S = S + CC(IP)*DA(IP)
  100   CONTINUE
        DGRC(MINC) = -S/BETA(MINC)
        DO 110 IP = 2,MINC
          M2 = MSUB + IP - 1
          DA(IP) = DA(IP) + DGRC(MINC)*B(M2)
  110   CONTINUE
        DA(MINC+1) = DGRC(MINC)
  120   CONTINUE
        S = DGRC(MINC)*DGRC(MINC)*BETA(MINC)
        DALPHA = DALPHA - S
        IF (DALPHA) 140,140,130
  130 CONTINUE
  140 CONTINUE
      DO 150 I = 1,MP
        A(I) = SNGL(DA(I))
        GRC(I) = SNGL(DGRC(I))
  150 CONTINUE
      ALPHA = SNGL(SQRT(DALPHA))
      RETURN
      END
FILENAME: VOCODE.FR    DATE: 12: 2:83    TIME:

C*****************************************************************
C
C  PROGRAM:    VOCODE
C  AUTHOR:     CRAIG MCKOWN
C  DATE:       24 AUG 83 - 30 SEP 83
C
C  FUNCTION:   USES THE OUTPUT OF "PREDICT" (THE LPC CODER) TO
C              PRODUCE OUTPUT SPEECH.  THIS IS A VOCODER.
C
C  LOAD LINE:  RLDR VOCODE IOF UNVOCD DRAND GLOT1 GLOT2 GLOT3
C              THROAT @FLIB@
C
C*****************************************************************
C
C  PARAM:  FILE FROM WHICH LPC DATA IS READ
C  DUMMY:  A DUMMY FILE, NOT USED
C  RUMMY:  FILE TO WHICH NORMALIZED VOCODED SPEECH IS WRITTEN
C  AR:     LPC COEFFICIENTS - AR(1) IN THIS PROGRAM IS THE SAME
C          AS AR(2) IN THE CODER PROGRAM.
C  U:      OUTPUT OF "UNVOCD" OR "VOICED" - AN INPUT TO "THROAT"
C  W:      MEMORY FOR "THROAT"
C  S:      OUTPUT OF "THROAT" - VOCODED SPEECH
C  INTS:   ARRAY WITH INTEGER VALUES OF S
C  X:      INTEGER ARRAY USED FOR WRBLK
C  POLES:  THE NUMBER OF POLES OF THE OUTPUT FILTER.
C  VOCD:   FLAG WHICH DENOTES VOICED/UNVOICED DECISION FROM CODER
C  PIT:    PITCH INFORMATION (PITCH PERIOD IN SAMPLES)
C  IX:     DOUBLE PRECISION SEED NUMBER FOR SUBROUTINE "UNVOCD"
C
C*****************************************************************
      INTEGER SPEEFL(7),PARAM(7),DUMMY(7),RUMMY(7),MAIN(7)
      INTEGER POLES,P1,X(256),VOCD,PIT,INTS(200)
      DIMENSION U(250),AR(20),W(0:20),S(250)
      DOUBLE PRECISION IX

      DATA VALF,NPTS,N6,N5/0.0,0,0,0/
      DATA IS,IP,KS,KEND/1,0,0,0/
      IX = DBLE(203)
      NDONE = 0
      NFILES = 4
      CALL IOF(NFILES,MAIN,SPEEFL,DUMMY,PARAM,RUMMY,MS,S1,S2,S3,S4)
      CALL DFILW(RUMMY,IER)
      IF(IER.EQ.13) GO TO 40
      IF(IER.NE.1) TYPE " DELETE FILE ERROR ",IER
   40 CALL CFILW(RUMMY,3,88,IER)
      IF(IER.NE.1) TYPE " CREATE FILE ERROR ",IER
      CALL OPEN(4,RUMMY,3,IER)
      IF(IER.NE.1) TYPE " OPEN FILE ERROR ",IER
      CALL OPEN(3,PARAM,3,IER)
      IF(IER.NE.1) TYPE " OPEN FILE ERROR ",IER

C*** READ OVERALL PARAMETERS FOR VOCODING OF SPEECH
      READ BINARY(3) POLES,NEMP,UNGA,NGLT

C*********************************************************************
C*** SYNTHESIZE ONE [VARIABLE LENGTH] FRAME OF SPEECH
  100 CONTINUE
C*** READ FRAME PARAMETERS
      READ BINARY(3,END=1001) VOCD,P1,PIT,AL
      DO 110 J=1,POLES
        READ BINARY(3,END=1001) AR(J)
  110 CONTINUE

C*** SET MEMORY OF OUTPUT FILTER TO ZERO
      DO 120 I = 0,20
        W(I) = 0.0
  120 CONTINUE

C*** VOCD/UNVOCD/SILENCE DECISION
      IF(VOCD.EQ.1) GO TO 300              ; VOICED SPEECH
      IF(VOCD.EQ.2) GO TO 400              ; SILENCE

C*** UNVOICED SPEECH
      CALL UNVOCD(U,P1,IX)
      AL = AL*UNGA                         ; UNVOICED GAIN FACTOR
      CALL THROAT(U,P1,AR,POLES,AL,S,W)
      GO TO 500

C*** VOICED SPEECH
  300 CONTINUE
      IF(NGLT.EQ.1) CALL VOICED1(U,PIT,P1) ; POLY GLOT SHAPE
      IF(NGLT.EQ.2) CALL VOICED2(U,PIT,P1) ; TRIG GLOT SHAPE
      IF(NGLT.EQ.3) CALL VOICED3(U,PIT,P1) ; IMPULSE GLOT SHAPE
      CALL THROAT(U,P1,AR,POLES,AL,S,W)
      GO TO 500

C*** SILENCE
  400 CONTINUE
      DO 450 I = 1,P1
        S(I) = 0.0                         ; AUTOMATICALLY SET S TO ZERO
  450 CONTINUE

C*** DE-EMPHASIZE AND WRITE SPEECH
  500 IF(NEMP.EQ.0) GO TO 555              ; NO PRE/DE-EMPHASIS
      CONTINUE
      DO 550 J = 1,P1
        IF(VALF.GT.2500.) VALF = 2500.
        IF(VALF.LT.-2500.) VALF = -2500.
        IF((VALF.GT.-0.01).AND.(VALF.LT.0.01)) VALF = 0.0
        VALD = S(J)
        IF(VALD.GT.2000.) VALD = 2000.
        IF(VALD.LT.-2000.) VALD = -2000.
        VALE = VALD + .9*VALF              ; Y(Z) = X(Z) + .9(Z**-1)Y(Z)
        VALF = VALE
        INTS(J) = INT(VALE)
  550 CONTINUE
      GO TO 560

C*** NO DE-EMPHASIS
  555 CONTINUE
      DO 560 J = 1,P1
        INTS(J) = INT(S(J))
  560 CONTINUE

C*** COUNTER & WRITE ROUTINE
      IP = IP + P1
      L = 1
      IF(IP.GE.256) GO TO 210              ; SPLIT S(I) & WRBLK(...,X,...)
      DO 200 I = IS,IP
        X(I) = INTS(L)                     ; LOAD UP X(I) AS REQUIRED
        L = L + 1
  200 CONTINUE
      GO TO 240                            ; SKIP WRBLK
  210 CONTINUE
      IP = IP - 256                        ; RESET IP
      DO 220 I = IS,256
        X(I) = INTS(L)                     ; LOAD UP X(I)
        L = L + 1
  220 CONTINUE
      CALL WRBLK(4,KS,X,1,IER)
      IF(IER.EQ.9) GO TO 1001              ; END OF FILE
      IF(IER.NE.1) TYPE " WRBLK ERROR ON FILE #2 ",IER
      IF(IP.EQ.0) GO TO 230
      DO 230 I = 1,IP
        X(I) = INTS(L)                     ; RESTART LOAD UP OF X(I)
        L = L + 1
  230 CONTINUE
      KS = KS + 1                          ; INCREMENT BLOCK COUNT
  240 IS = IP+1
      NDONE = NDONE + 1
      GO TO 100



C*** SPEECH VOCODED
 1001 CONTINUE
      TYPE " NDONE = ",NDONE
      TYPE " SPEECH VOCODED "

C*** NORMALIZATION ROUTINE
      S1 = 0.0
      DO 700 J = 0,87
        CALL RDBLK(4,J,X,1,IER)
        IF(IER.EQ.9) GO TO 701
        IF(IER.NE.1) TYPE " RDBLK ERROR ON FILE #2 ",IER
        DO 600 I = 1,256
          N2 = IABS(X(I))
          IF(N2.GT.N5) N5 = N2             ; CHECK FOR MAXIMUM VALUE
  600   CONTINUE
        KEND = J
  700 CONTINUE
  701 CONTINUE
      TYPE " THE MAX VALUE FOUND WAS ",N5
      S2 = 2000.0/FLOAT(N5)
      DO 800 J = 0,KEND
        CALL RDBLK(4,J,X,1,IER)
        IF(IER.NE.1) TYPE " READ BLOCK ERROR ",IER
        DO 750 I = 1,256
          S1 = FLOAT(X(I))*S2              ; NORMALIZE TO A MAX OF 2000
          X(I) = INT(S1)
  750   CONTINUE
        CALL WRBLK(4,J,X,1,IER)
        IF(IER.NE.1) TYPE " WRITE BLOCK ERROR ",IER
  800 CONTINUE
      IF(KEND.GE.87) GO TO 900             ; PRECAUTIONARY STEP TO AVOID
      DO 840 I = 1,256                     ; OVERLOADING FILE #4
        X(I) = 0
  840 CONTINUE
      DO 850 J = KEND,87
        CALL WRBLK(4,J,X,1,IER)            ; SET ALL UNSET BLOCKS TO ZERO
  850 CONTINUE

C*** CLOSE FILES AND CHECK FOR ARITHMETIC ERRORS
  900 CALL CLOSE(4,IER)
      CALL CLOSE(3,JER)
      IF((IER.NE.1).OR.(JER.NE.1)) TYPE " CLOSE FILE ERROR ",IER,JER
      CALL DVDCHK(IDIV1)
      IF(IDIV1.EQ.1) TYPE " DIVIDE BY ZERO OCCURRED "
      CALL OVERFL(IFLO1)
      IF(IFLO1.EQ.1) TYPE " OVERFLOW OCCURRED "
      IF(IFLO1.EQ.3) TYPE " UNDERFLOW OCCURRED "
      STOP
      END
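
Two small signal operations in VOCODE are easy to try in isolation: the de-emphasis recursion y[n] = x[n] + 0.9 y[n-1] with its clamps, and the final scaling of the output to a peak of 2000. The Python sketch below illustrates both under simplifying assumptions: it works on in-memory lists where the listing works on 256-word disk blocks, and the function names and default limits are hypothetical parameters taken from the constants visible in the listing.

```python
def deemphasize(s, memory=0.0, coeff=0.9, in_limit=2000.0, mem_limit=2500.0):
    """Apply y[n] = x[n] + coeff*y[n-1] with the listing's clamps.

    Returns (integer_samples, new_memory); pass new_memory to the next call
    so the filter state carries across frames, as VALF does in the listing.
    """
    out = []
    for x in s:
        memory = max(-mem_limit, min(mem_limit, memory))  # clamp filter state
        if -0.01 < memory < 0.01:
            memory = 0.0                                  # flush tiny values
        x = max(-in_limit, min(in_limit, float(x)))       # clamp the input
        y = x + coeff * memory
        memory = y
        out.append(int(y))                                # truncate like INT
    return out, memory

def normalize_peak(samples, peak=2000):
    """Scale integer samples so the largest magnitude becomes `peak`."""
    largest = max(abs(v) for v in samples)
    if largest == 0:
        return list(samples)  # guard against division by zero (not in listing)
    scale = peak / float(largest)
    return [int(v * scale) for v in samples]
```

The zero-peak guard is an addition; the listing instead relies on the DVDCHK call at the end of the program to report a division by zero.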
FILENAME: UNVOCD.FR    DATE: 12: 2:83    TIME: 13:46:25    PAGE

C*****************************************************************
C
C  THIS SUBROUTINE CREATES A NORMAL RANDOM SEQUENCE WHICH WILL
C  BE A RANDOM NOISE INPUT TO THROAT.  THE OUTPUT OF THIS
C  PROGRAM IS AN ARRAY, U(I), OF LENGTH AS DETERMINED IN THE
C  CALLING ROUTINE.
C
C  NOTE: THIS MUST BE LINKED TOGETHER WITH DRAND.
C
C  PARAMETERS:
C    U(I)        OUTPUT SEQUENCE
C    FRMSIZ      LENGTH OF ARRAY
C    IX          DOUBLE PRECISION SEED TO THIS ROUTINE.
C                A NEW IX IS GENERATED BY THE PROGRAM
C                TO FEED THE NEXT ITERATION.
C    DRAND(IX):  A DOUBLE PRECISION FUNCTION -
C                GENERATES UNIFORM PDF.
C
C*****************************************************************
      SUBROUTINE UNVOCD(U,FRMSIZ,IX)

      DOUBLE PRECISION U1,V,W,T,X,E,E2,E10,E3,Z,P,P1,F,P2,P3,P4,P5,PI
     X ,PI2,A1,A2,A3,A4,A5,A6,A7,A8,A9,A10,A12,A13,A14,A15,A16,A17,A18
      DOUBLE PRECISION INTEGER IX
      INTEGER FRMSIZ
      DIMENSION U(1)

      DATA A1/.884070402298758D0/,
     X     A2/1.131131635444180D0/,
     X     A3/.986655477086949D0/,
     X     A4/.958720824790463D0/,
     X     A5/.630834801921960D0/,
     X     A6/.755591531667601D0/,
     X     A7/.034240503750111D0/,
     X     A8/.911312780288703D0/,
     X     A9/.479727404222441D0/,
     X     A10/1.10547366102207D0/,
     X     A12/.872834976671790D0/,
     X     A13/.049264496373128D0/,
     X     A14/.595507138015940D0/,
     X     A15/.805577924423817D0/,
     X     A16/.053377549506886D0/,
     X     A17/.973310954173898D0/,
     X     E/2.216035867166471D0/,
     X     A18/.180025191068563D0/

C*** CALCULATE THE NORMAL FUNCTION
      PI = 3.1415926536D0
      PI2 = (PI*2.0)**(-.5)
      E2 = E**2.0
      E3 = E2/2.0
      DO 2200 N = 1,FRMSIZ
        U1 = DRAND(IX)
        IF(U1 .GT. A1) GO TO 1000
        V = DRAND(IX)
        X = E * (A2 * U1 + V - 1)
        GO TO 9000

 1000   IF(U1 .LT. A17) GO TO 1200
 1005   V = DRAND(IX)
        W = DRAND(IX)
        T = E3 - DLOG(W)
        E10 = T * V**2
        IF(E10 .GT. E3) GO TO 1005
        IF(U1 .LT. A3) GO TO 1010
        X = -(2.0 * T)**.5
        GO TO 9000
 1010   X = (2.0 * T)**.5
        GO TO 9000

 1200   IF(U1 .LT. A4) GO TO 1500
 1300   V = DRAND(IX)
        W = DRAND(IX)
        Z = V - W
        T = E - A5 * DMIN1(V,W)
        P = DMAX1(V,W)
        IF(P .LT. A6) GO TO 1800
        P1 = A7 * DABS(Z)
        F = PI2*DEXP(-T*T/2.0)-A18*(E-DABS(T))
        IF(P1 .LE. F) GO TO 1800
        GO TO 1300

 1500   IF(U1 .LE. A8) GO TO 1700
 1600   V = DRAND(IX)
        W = DRAND(IX)
        Z = V - W
        T = A9 + A10 * DMIN1(V,W)
        P2 = DMAX1(V,W)
        IF(P2 .LE. A12) GO TO 1800
        P3 = A13 * DABS(Z)
        F = PI2 * DEXP(-T*T/2.0)-A18*(E-DABS(T))
        IF(P3 .LE. F) GO TO 1800
        GO TO 1600

 1700   V = DRAND(IX)
        W = DRAND(IX)
        Z = V - W
        T = A9 - A14 * DMIN1(V,W)
        P4 = DMAX1(V,W)
        IF(P4 .LE. A15) GO TO 1800
        P5 = A16 * DABS(Z)
        F = PI2 * DEXP(-T*T/2.0)-A18*(E-DABS(T))
        IF(P5 .LE. F) GO TO 1800
        GO TO 1700

 1800   IF(Z .LT. 0.0) GO TO 1900
        X = -T
        GO TO 9000
 1900   X = T
 9000   CONTINUE
        U(N) = SNGL(X)
 2200 CONTINUE
      IF(FRMSIZ.GE.200) GO TO 2400
      DO 2300 I = FRMSIZ+1,200
        U(I) = 0.0
 2300 CONTINUE
 2400 RETURN
      END
FILENAME: DRAND.FR    DATE: 12: 2:83    TIME: 13:48: 6    PAGE

C*****************************************************************
C
C  THIS FUNCTION IS A UNIFORM RANDOM NUMBER GENERATOR.
C
C*****************************************************************

      DOUBLE PRECISION FUNCTION DRAND(IX)
      DOUBLE PRECISION INTEGER IX,A,P,B15,B16,XHI,XALO,LEFTLO,FHI,K
      DATA A/16807D0/,B15/32768D0/,B16/65536D0/,P/2147483647D0/
      XHI = IX/B16
      XHI = XHI - DMOD(XHI,1D0)
      XALO = (IX - XHI * B16) * A
      LEFTLO = XALO/B16
      LEFTLO = LEFTLO - DMOD(LEFTLO,1D0)
      FHI = XHI * A + LEFTLO
      K = FHI/B15
      K = K - DMOD(K,1D0)
      IX = (((XALO-LEFTLO*B16)-P) + (FHI-K*B15)*B16)+K
      IF(IX.LT.0.D0) IX = IX + P
      DRAND = IX * 4.656612875D-5 / 1.52304D0
      RETURN
      END
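
The constants in DRAND (multiplier 16807, modulus 2147483647) are those of the multiplicative congruential generator often called the "minimal standard" generator; the split into high and low 16-bit halves is a way of forming the product without overflowing the machine word. In a language with arbitrary-precision integers the same seed sequence can be sketched directly, as below. The function names are hypothetical, and the scaling to the unit interval (division by the modulus) is an assumption of this sketch that may differ from the scaling constant printed in the listing.

```python
MULTIPLIER = 16807
MODULUS = 2147483647  # 2**31 - 1

def lehmer_next(seed):
    """Advance the seed: seed <- (16807 * seed) mod (2**31 - 1)."""
    return (MULTIPLIER * seed) % MODULUS

def lehmer_uniform(seed):
    """Return (new_seed, u) with u in the open interval (0, 1).

    Dividing by the modulus is an assumption of this sketch.
    """
    seed = lehmer_next(seed)
    return seed, seed / float(MODULUS)
```

A commonly quoted check for this generator is that, starting from a seed of 1, the seed after 10000 steps is 1043618065.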
FILENAME: GLOT1.FR    DATE: 12: 2:83

C  GLOTTAL PULSE SHAPE - POLYNOMIAL FUNCTION
C*****************************************************************
C
C  THIS SUBROUTINE PRODUCES AN INPUT TO THE SYNTHESIS FILTER
C
C  [DIAGRAM OF ONE PITCH PERIOD SHOWING THE INTERVALS TP AND TN
C   AND THE SAMPLE POINTS NP AND NN]
C
C  INPUTS:
C    PPF:   THE PITCH PERIOD
C    SIZE:  THE FRAME SIZE
C
C  OUTPUTS:
C    U(I):  THE OUTPUT SEQUENCE NEEDED AS INPUT TO "THROAT."
C
C*****************************************************************
      SUBROUTINE VOICED1(U,PPF,SIZE)

      DIMENSION U(200)
      INTEGER PPF,SIZE

      NPOS = 1
      TP = .030 * FLOAT(PPF)
      TN = .012 * FLOAT(PPF)
      NP = INT(TP)
      NN = INT(TP + TN)
      M = SIZE/PPF
      K = 0
      DO 60 J = 1,M
C*** CALCULATE ONE FRAMES WORTH OF U
        TIME = 1.0
        DO 50 I = 1,PPF
          K = K + 1
          IF(I .GT. NP) GO TO 20
          U(K) = (3.*(TIME/TP)**2) - (2.*(TIME/TP)**3)
          GO TO 40
   20     IF(I .GT. NN) GO TO 30
          U(K) = (1.-((TIME-TP)/TN)**2)
          GO TO 40
   30     U(K) = 0.0
   40     TIME = TIME + 1.0
   50   CONTINUE
   60 CONTINUE
      DO 70 I = SIZE+1,200
        U(I) = 0.               ; ZERO FILL THE REST OF THE ARRAY
   70 CONTINUE
      RETURN
      END

FILENAME:  GLQT2. FR 


DATE: 
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GLOTTAL  PULSE  SHAPE  -  TRIGONOMETRIC 


C  *•******•*******■»■*#■*■#  *******************  ********************************* 

c 

C  THIS  SUBROUTINE  PRODUCES  AN  INPUT  TO  THE  SYNTHESIS  FILTER 

C 
C 
C 
C 
C 
C 
C 
C 
C 
C 
C 
C 
C 
C 

c 
c 
c 

C  INPUTS: 

C  PPF : 

C  SIZE: 

C 

C  OUTPUTS: 


'  NP  NN 

\  A  A 

'<-TP— TN— >■' 


■1  PITCH  PER IOD- 


THE  PITCH  PERIOD 
THE  FRAME  SIZE 


THE  OUTPUT  SEQUENCE  NEEDED  AS  INPUT  TO  "THROAT.  ” 


C  U(I): 

C 

C*************************************************************** 
      SUBROUTINE VOICED2(U,PPF,SIZE)

      DIMENSION U(200)
      INTEGER PPF,SIZE

      PI = 3.14159
      PI2 = PI/2.0
      NPOS = 1
      TP = .030 * FLOAT(PPF)
      TN = .012 * FLOAT(PPF)
      NP = INT(TP)
      NN = INT(TP + TN)
      M = SIZE/PPF           ; NUMBER OF PITCH PERIODS PER FRAME
      K = 0
      DO 60 J = 1,M
C***  CALCULATE ONE FRAME OF U(I)
      TIME = 1.0
      DO 50 I = 1,PPF
      K = K + 1
      IF(I .GT. NP)GO TO 20
      U(K) = (.3)*(1.0-COS(TIME*PI/TP))
      GO TO 40
   20 IF(I .GT. NN)GO TO 30
      U(K) = (.3)*COS(((TIME-TP)/TN)*PI2)
      GO TO 40
   30 U(K) = 0.0
   40 TIME = TIME + 1.0
   50 CONTINUE
   60 CONTINUE
      DO 70 I = SIZE+1,200
      U(I) = 0.0
   70 CONTINUE

      RETURN
      END
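The two pulse shapes above can be compared side by side with a short sketch. This is an illustration in Python, not a port of VOICED1 or VOICED2: the function name `glottal_pulse` and its arguments are hypothetical, the rise and fall times are passed in directly instead of being derived from PPF, and the trigonometric branch uses the common textbook amplitudes (0.5 on the rise, 1.0 on the fall) rather than the scale factors in the listing.

```python
import math


def glottal_pulse(period, tp, tn, shape="polynomial"):
    """Return one pitch period (a list of `period` samples) of a glottal pulse.

    tp is the rise time and tn the fall time, both in samples (assumed inputs).
    """
    rise_end = int(tp)          # corresponds to NP in the listings
    fall_end = int(tp + tn)     # corresponds to NN in the listings
    pulse = []
    for i in range(1, period + 1):
        t = float(i)
        if i <= rise_end:
            x = t / tp
            if shape == "polynomial":
                value = 3.0 * x ** 2 - 2.0 * x ** 3
            else:
                value = 0.5 * (1.0 - math.cos(math.pi * x))
        elif i <= fall_end:
            x = (t - tp) / tn
            if shape == "polynomial":
                value = 1.0 - x ** 2
            else:
                value = math.cos(0.5 * math.pi * x)
        else:
            value = 0.0         # closed phase of the pitch period
        pulse.append(value)
    return pulse


poly = glottal_pulse(100, 40.0, 16.0, "polynomial")
trig = glottal_pulse(100, 40.0, 16.0, "trigonometric")
```

With these assumed values both shapes rise to their peak at sample 40, fall back to zero by sample 56, and stay at zero for the rest of the period.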
FILENAME:  GLOT3.FR     DATE:  12: 2:83     TIME:  13:46: 4

GLOTTAL PULSE SHAPE - IMPULSE (TP=1,TN=0)

C***********************************************************************
C
C     THIS SUBROUTINE PRODUCES AN INPUT TO THE SYNTHESIS FILTER
C
C                  NP     NN
C                  ^      ^
C        |<--TP-->|<-TN->|
C        |<------- 1 PITCH PERIOD ------->|
C
C     INPUTS:
C       PPF:    THE PITCH PERIOD
C       SIZE:   THE FRAME SIZE
C
C     OUTPUTS:
C       U(I):   THE OUTPUT SEQUENCE NEEDED AS INPUT TO "THROAT."
C
C***********************************************************************
      SUBROUTINE VOICED3(U,PPF,SIZE)

      DIMENSION U(200)
      INTEGER PPF,SIZE

      TIME = 1.0
      G = 1.0
      U(1) = G
      NPF2 = PPF+2
      DO 400 K = 2,PPF
      U(K) = 0.0
  400 CONTINUE
      U(PPF+1) = G
      DO 500 K = NPF2,SIZE
      U(K) = 0.0
  500 CONTINUE

      RETURN
      END
C***********************************************************************
C
C     THIS SUBROUTINE INPUTS U (A SEQUENCE OF VOICED/UNVOICED
C     INPUTS) AND PASSES IT THROUGH A TIME VARYING DIGITAL FILTER
C     TO PRODUCE AN OUTPUT SPEECH SEQUENCE.
C
C     INPUTS:
C       U(I):     SEQUENCE GENERATED BY EITHER "VOICED"
C                 OR "UNVOCD."
C                 EITHER A PULSE AT THE PITCH PERIOD OR
C                 RANDOM NOISE
C       ICOUNT:   THE FRAME LENGTH
C       FILTER:   THE FILTER COEFFICIENTS
C       NORDER:   THE ORDER OF THE FILTER
C       GAIN1:    THE GAIN OF THE FILTER, AL IN THE "VOCODE."
C       W(I):     THE MEMORY OF THE FILTER
C
C     OUTPUTS:
C       W(I):     THE MEMORY OF THE FILTER
C       S(I):     THE OUTPUT SPEECH SEQUENCE
C
C***********************************************************************

      SUBROUTINE THROAT(U,ICOUNT,FILTER,NORDER,GAIN1,S,W)
      DIMENSION U(1),FILTER(1),W(0:20),S(1)

      DO 500 N=1,ICOUNT
      TOTAL = 0.0
      DO 400 K=1,NORDER
      TOTAL = TOTAL - W(K) * FILTER(K)
  400 CONTINUE
      W(0) = TOTAL + GAIN1 * U(N)
      S(N) = W(0)
      DO 450 I=1,NORDER
      J = NORDER + 1 - I
      W(J) = W(J-1)
  450 CONTINUE
      W(0) = 0.0
  500 CONTINUE

      IF(ICOUNT.GE. 200) GO TO 1000
      DO 600 I = ICOUNT+1,200
      S(I) = 0.0
  600 CONTINUE
 1000 RETURN
      END
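The recursion in THROAT is an all-pole filter whose memory is carried from one frame to the next. A minimal Python sketch of the same idea follows. The function name `throat` is reused for readability only, and the argument layout is an assumption for illustration: `memory` is a list of past outputs (newest first) of the same length as `coeffs`, and it is updated in place so that a second call continues where the first stopped.

```python
def throat(u, coeffs, gain, memory):
    """All-pole synthesis: s[n] = gain*u[n] - sum_k coeffs[k-1] * s[n-k].

    `memory` holds the most recent outputs, newest first, and is updated
    in place so the filter state survives across frames.
    """
    out = []
    for x in u:
        total = gain * x
        for a, past in zip(coeffs, memory):
            total -= a * past
        out.append(total)
        memory.insert(0, total)   # newest output goes to the front
        memory.pop()              # oldest output falls off the end
    return out


state = [0.0]
first_frame = throat([1.0, 0.0, 0.0, 0.0], [-0.5], 1.0, state)
second_frame = throat([0.0], [-0.5], 1.0, state)
```

With the single coefficient -0.5 the impulse response halves at every sample, and the second call picks up the decay from the saved state instead of restarting at zero.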
FILENAME:  SETUP.FR     DATE:  2:83     TIME:  13:48:19

C***********************************************************************
C
C     PROGRAM:   SETUP
C     AUTHOR:    WILL JANSSEN / REVISED BY C MCKOWN
C     DATE:      17 APRIL 83 / ON 2 SEPT 83
C     LANGUAGE:  FORTRAN5
C     FUNCTION:  THIS PROGRAM ALLOWS THE USER TO SETUP A FILE THAT
C                CONTAINS INFORMATION REQUIRED TO RUN THE
C                LINEAR PREDICTIVE CODER WRITTEN BY CRAIG MCKOWN.
C                THE PROGRAM WILL ALLOW THE USER THE FOLLOWING
C                OPTIONS.
C                  1) CREATE A NEW FILE
C                  2) UPDATE AN OLD FILE
C                  3) PRINT PARAMETERS
C
C     LOAD COMMAND LINE:  RLDR SETUP @FLIB@
C
C     NOTE:  1) THE ARRAYS ARE SET TO MAX OF 10 VARIABLES
C               EACH.
C            2) THE REAL ARRAY IS CALLED RELVAR AND THE
C               INTEGER ARRAY IS CALLED INTVAR
C
C***********************************************************************
C
C     SETUP
C
C************

      DIMENSION RELVAR(10),INTVAR(15),OUTFILE(7)
      INTEGER YES,YES2,SIZER,SIZEI,YES5
C***  SIZER-REAL ARRAY SIZE, SIZEI-INTEGER ARRAY SIZE
      SIZER = 10
      SIZEI = 15

C***  NEW OR OLD FILE ***
      TYPE "THIS PROGRAM CREATES OR UPDATES A DECISION VARIABLE FILE."
      TYPE "ARE YOU UPDATING AN OLD FILE?"
      ACCEPT "(1-YES,0-NO) ",YES

C***  GET FILE NAME ***
   20 ACCEPT "FILE NAME? "
      READ(11,39)OUTFILE(1)
   39 FORMAT(S13)
      IF (YES.EQ. 1)GO TO 30
      CALL DFILW(OUTFILE,JER)
      IF(JER.NE. 13) TYPE "YOU DELETED A CURRENT FILE!"
      IF((JER.NE. 1).AND.(JER.NE. 13)) TYPE "DELETE FILE ERROR",JER
      CALL CFILW(OUTFILE,2,JER)
      IF (JER.NE. 1) TYPE "CREATE FILE ERROR!",JER
   30 CALL OPEN(1,OUTFILE,3,IER)
      IF(IER .NE. 1) TYPE "OPEN ERROR ",IER

C***  INITIALIZE THE ARRAYS, NEW FILES-SET TO 0, OLD FILES-READ IN
C     OLD FILES***
      IF (YES.EQ. 1)GO TO 50
      DO 45 I=1,SIZER
   45 RELVAR(I) = 0.0
      DO 47 I=1,SIZEI
   47 INTVAR(I) = 0
      GOTO 60
   50 READ (1,901)(RELVAR(I),I=1,SIZER)
      READ (1,902)(INTVAR(I),I=1,SIZEI)

C***  UPDATE ARRAYS ***
   60 CONTINUE

      TYPE "<CR>
     X IF YOU CHOOSE TO CHANGE A VARIABLE ENTER : Y <CR>
     X OTHERWISE ENTER ANOTHER LETTER <CR> "
      TYPE " "
      TYPE "CURRENT VALUE OF ACCEPT/NOT ACCEPT (A-0,NA-1): ",INTVAR(1)
      TYPE "CHANGE VALUE? "
      CALL RCHAR(ICHAR,IER)
      IF(ICHAR.NE. 89)GO TO 5000
      ACCEPT "<CR> INPUT NEW VALUE : ",INTVAR(1)
 5000 TYPE " "
      TYPE "CURRENT NUMBER OF POLES IS : ",INTVAR(2)
      TYPE "CHANGE VALUE? "
      CALL RCHAR(ICHAR,IER)
      IF(ICHAR.NE. 89)GO TO 5001
      ACCEPT "<CR> INPUT NEW VALUE: ",INTVAR(2)
 5001 TYPE " "
      TYPE "THE METHOD OF PREDICTION IS "
      TYPE "(AUTO-0,COVAR-1): ",INTVAR(3)
      TYPE "CHANGE VALUE? "
      CALL RCHAR(ICHAR,IER)
      IF(ICHAR.NE. 89)GO TO 5002
      ACCEPT "<CR> INPUT NEW VALUE: ",INTVAR(3)
 5002 TYPE " "
      TYPE "CURRENT VALUE:NO. OF POINTS/SET (MAXFR): ",INTVAR(4)
      TYPE "CHANGE VALUE? "
      CALL RCHAR(ICHAR,IER)
      IF(ICHAR.NE. 89)GO TO 5003
      ACCEPT "<CR> INPUT NEW VALUE: ",INTVAR(4)
 5003 TYPE " "
      TYPE "THE CURRENT VALUE OF FILTER SPACINGS IS (MAXPT): ",INTVAR(5)
      TYPE "CHANGE VALUE? "
      CALL RCHAR(ICHAR,IER)
      IF(ICHAR.NE. 89)GO TO 5004
      ACCEPT "<CR> INPUT NEW VALUE: ",INTVAR(5)
 5004 TYPE " "
      TYPE "THE CURRENT VALUE OF PRE/DE-EMP (1-Y,0-N) IS: ",INTVAR(6)
      TYPE "CHANGE VALUE? "
      CALL RCHAR(ICHAR,IER)
      IF(ICHAR.NE. 89)GO TO 5005
      ACCEPT "<CR> INPUT NEW VALUE: ",INTVAR(6)
 5005 TYPE " "
      TYPE "THE CURRENT VALUE OF GLOTTAL SHAPE IS "
      TYPE "(1-POLYNOMIAL, 3-IMPULSE) : ",INTVAR(7)
      TYPE "CHANGE VALUE? "
      CALL RCHAR(ICHAR,IER)
      IF(ICHAR.NE. 89)GO TO 5006
      ACCEPT "<CR> INPUT NEW VALUE: ",INTVAR(7)
 5006 TYPE " "
      TYPE "THE CURRENT VALUE OF HAMMING WINDOW (0-NO,1-YES): ",INTVAR(8)
      TYPE "CHANGE VALUE? "
      CALL RCHAR(ICHAR,IER)
      IF(ICHAR.NE. 89)GO TO 5007
      ACCEPT "<CR> INPUT NEW VALUE: ",INTVAR(8)
 5007 TYPE " "
      TYPE "THE METHOD OF PITCH DETECTION IS "
      TYPE "(SIFT-0,AUTOC-1): ",INTVAR(9)
      TYPE "CHANGE VALUE? "
      CALL RCHAR(ICHAR,IER)
      IF(ICHAR.NE. 89)GO TO 5008
      ACCEPT "<CR> INPUT NEW VALUE: ",INTVAR(9)
 5008 TYPE " "
      TYPE "PITCH DET'N AND COEF. CAL'N FROM SAME FILE?"
      TYPE "CURRENT VALUE (1-Y,0-N): ",INTVAR(10)
      TYPE "CHANGE VALUE? "
      CALL RCHAR(ICHAR,IER)
      IF(ICHAR.NE. 89)GO TO 5010
      ACCEPT "<CR> INPUT NEW VALUE: ",INTVAR(10)
 5010 TYPE " "
      TYPE "THE CURRENT VALUE OF VOICED/UN THRESH IS: ",RELVAR(1)
      TYPE "CHANGE VALUE? "
      CALL RCHAR(ICHAR,IER)
      IF(ICHAR.NE. 89)GO TO 5011
      ACCEPT "<CR> INPUT NEW REAL VALUE: ",RELVAR(1)
 5011 TYPE " "
      TYPE "CURRENT VALUE OF SPEECH SCALE-(IN CODER): ",RELVAR(2)
      TYPE "CHANGE VALUE? "
      CALL RCHAR(ICHAR,IER)
      IF(ICHAR.NE. 89)GO TO 5015
      ACCEPT "<CR> INPUT NEW REAL VALUE: ",RELVAR(2)
 5015 TYPE " "
      TYPE "CURRENT VALUE OF SILENCE THRESH-(IN ENER) IS: ",RELVAR(3)
      TYPE "CHANGE VALUE? "
      CALL RCHAR(ICHAR,IER)
      IF(ICHAR.NE. 89)GO TO 5016
      ACCEPT "<CR> INPUT NEW REAL VALUE: ",RELVAR(3)
 5016 TYPE " "
      TYPE "CURRENT VALUE OF UNVOICED GAIN FACTOR IS: ",RELVAR(4)
      TYPE "CHANGE VALUE? "
      CALL RCHAR(ICHAR,IER)
      IF(ICHAR.NE. 89)GO TO 5020
      ACCEPT "<CR> INPUT NEW REAL VALUE: ",RELVAR(4)
 5020 CONTINUE

C***  TYPE ARRAY ***
      TYPE "THE ARRAYS HAVE BEEN LOADED"
      ACCEPT "DO YOU WANT TO HAVE THE ARRAY TYPED(1-YES,0-NO): ",YES
      IF(YES .EQ. 0)GO TO 200
      TYPE " ACCEPT/NOT ACCEPT:  ",INTVAR(1)
      TYPE " NUMBER OF POLES:  ",INTVAR(2)
      TYPE " METHOD (0-AUTO, 1-COVAR,):  ",INTVAR(3)
      TYPE " MAXFR:  ",INTVAR(4)
      TYPE " MAXPT:  ",INTVAR(5)
      TYPE " PRE/DE-EMP (1-Y,0-N):  ",INTVAR(6)
      TYPE " GLOT (1-POLYNOMIAL, 3-IMPULSE):  ",INTVAR(7)
      TYPE " HAMMING WINDOW? (1-Y,0-N):  ",INTVAR(8)
      TYPE " METHOD PITCH DET (0-SIFT,1-AUTOC):  ",INTVAR(9)
      TYPE " PITCH & COEF'S SAME FILE(1-Y,0-N):  ",INTVAR(10)
      TYPE "VOICED/UN THRESHOLD:  ",RELVAR(1)
      TYPE "SPEECH SCALE-(IN CODER):  ",RELVAR(2)
      TYPE "SILENCE THRESHOLD  ",RELVAR(3)
      TYPE "UNVOICED GAIN FACTOR  ",RELVAR(4)


C***  OUTPUT FILE ***

  200 TYPE "WRITE DECISION VARIABLES TO SAME FILE?"
      ACCEPT "(1-YES,0-NO): ",YES2
      IF (YES2 .EQ. 1)GO TO 73
      CALL CLOSE(1,IER)
      IF (IER .NE. 1)TYPE "CLOSE FILE ERROR1 ",IER
      ACCEPT "FILE NAME? "
      READ(11,69)OUTFILE(1)
   69 FORMAT(S13)
      CALL DFILW(OUTFILE,JER)
      IF(JER.EQ. 13) TYPE "YOU DELETED A CURRENT FILE!"
      IF((JER.NE. 1).AND.(JER.NE. 13)) TYPE "DELETE FILE ERROR",JER
      CALL CFILW(OUTFILE,2,JER)
      IF (JER.NE. 1) TYPE "CREATE FILE ERROR!",JER
   70 CALL OPEN(1,OUTFILE,3,IER)
      IF (IER .NE. 1) TYPE "OPEN ERROR ",IER
   73 CALL REWIND(1)
      WRITE (1,901)(RELVAR(I),I=1,SIZER)
      WRITE (1,902)(INTVAR(I),I=1,SIZEI)
      CALL CLOSE(1,IER)
      IF (IER .NE. 1)TYPE "CLOSE FILE ERROR2 ",IER
      ACCEPT "PRINT ARRAY ON PRINTRONICS?(1-Y,0-N) ",YES
      IF(YES .EQ. 0)GO TO 1001
      WRITE(12,1499)OUTFILE(1)
      CALL FGDAY(IMON,IDAY,IYEAR)
      CALL FGTIME(IHOUR,IMIN,ISEC)
      WRITE (12,1311)IDAY,IMON,IYEAR
      WRITE (12,1312)IHOUR,IMIN,ISEC
 1311 FORMAT("0","DATE : ",1X,I2,"/",I2,"/",I2)
 1312 FORMAT("0","TIME : ",1X,I2,": ",I2,": ",I2)
      WRITE(12,1500)INTVAR(1)
      WRITE(12,1501)INTVAR(2)
      WRITE(12,1502)INTVAR(3)
      WRITE(12,1503)INTVAR(4)
      WRITE(12,1504)INTVAR(5)
      WRITE(12,1505)INTVAR(6)
      WRITE(12,1506)INTVAR(7)
      WRITE(12,1507)INTVAR(8)
      WRITE(12,1508)INTVAR(9)
      WRITE(12,1509)INTVAR(10)
      WRITE(12,1600)RELVAR(1)
      WRITE(12,1601)RELVAR(2)
      WRITE(12,1602)RELVAR(3)
      WRITE(12,1603)RELVAR(4)

 1499 FORMAT(1X,S13)
 1500 FORMAT("0"," ACCEPT/NOT ACCEPT ",I6)
 1501 FORMAT("0"," NUMBER OF POLES ",I6)
 1502 FORMAT("0"," METHOD (0-AUTO, 1-COVAR) ",I6)
 1503 FORMAT("0"," MAXFR ",I6)
 1504 FORMAT("0"," MAXPT ",I6)
 1505 FORMAT("0"," PRE/DE-EMP? (1-Y,0-N) ",I6)
 1506 FORMAT("0"," GLOTTAL PULSE (1-POLY, 3-IMPULSE) ",I6)
 1507 FORMAT("0"," HAMMING WINDOW? (1-Y,0-N) ",I6)
 1508 FORMAT("0"," METHOD PITCH DET (0-SIFT,1-AUTOC) ",I6)
 1509 FORMAT("0"," PITCH & COEF'S F'M SAME FILE(1-Y,0-N) ",I6)
 1600 FORMAT("0"," VOICED/UNVOICED THRESHOLD ",F12.5)
 1601 FORMAT("0"," SPEECH SCALE ",F12.5)
 1602 FORMAT("0"," SILENCE THRESHOLD ",F12.5)
 1603 FORMAT("0"," UNVOICED GAIN FACTOR ",F12.5)
  900 FORMAT(3X,'TEST1 : ',F10.5)
  901 FORMAT(3X,F12.5)
  902 FORMAT(3X,I10)
 1000 TYPE "PROGRAM COMPLETED"
 1001 STOP
      END

FILENAME:  SCALE.FR     DATE:  12: 2:83     TIME:  13:49:27

C***********************************************************************
C
C     PROGRAM SCALE.FR
C
C     THIS PROGRAM SCALES SPEECH FILES SO THAT THERE IS A MAX VALUE
C     OF 1900 AND CAN DE-EMPHASIZE SPEECH
C
C     INPUT:  MUST BE A BLOCKED FILE
C
C***********************************************************************

      DIMENSION S1(256),U(256)
      DOUBLE PRECISION IX
      INTEGER OUTFILE(7),INFILE(7),FILUFD(18),SPEECH(256)

      IX = DBLE(203)
      FLIP = 1.0
      NNEWS = 0
      ACCEPT "WARNING: THE INPUT FILE MUST BE AN INTEGER FILE <CR>
     X AND BE IN BLOCKED FORM. <CR> <CR>
     X DO YOU WISH TO CONTINUE?(1-Y,0-N) ",NYZ
      IF(NYZ .EQ. 0)GO TO 60
      ACCEPT "INPUT FILENAME :"
      READ(11,39)INFILE(1)
   39 FORMAT(S13)
      OPEN 1,INFILE,ATT="CI",ERR=40
      FLIP = 1.0
      ACCEPT "OUTPUT FILENAME :"
      READ(11,39)OUTFILE(1)
      OPEN 2,OUTFILE,ERR=50
      NDE = 0
      ACCEPT "OUTPUT FILE SIZE? ",ISIZE
      ACCEPT "PERFORM NOISE ADDITION?(1-Y,0-N)",NNOIS
      IF(NNOIS.EQ. 0)GO TO 53
      ACCEPT "SIZE OF MAX NOISE?(REAL) ",VNOSIZ
   53 CONTINUE
      MBLOCK = 1
      N15 = 0
      NV = 0
   70 N6 = 0
      DO 80 I = 1,256
      S1(I) = 0.0
      U(I) = 0.0
   80 CONTINUE
      N5 = 0

C***  FIRST PASS: COPY INPUT TO OUTPUT (WITH OPTIONAL NOISE) AND
C     FIND THE LARGEST MAGNITUDE
  100 CONTINUE
      CALL RDBLK(1,NV,SPEECH,MBLOCK,IENDS)
      IF(NNOIS.EQ. 0) GO TO 110
      CALL UNVOCD(U,256,IX)
  110 DO 200 J=1,256
      IF(NNOIS.EQ. 0)GO TO 120
      NNEWS = INT(U(J)*VNOSIZ/2)
      IF((J.EQ. 1).AND.(NV.EQ. 1)) TYPE " NOISE ADDED "
  120 SPEECH(J) = SPEECH(J) + NNEWS
  180 N2 = IABS(SPEECH(J))
      IF(N2 .GT. N5) N5 = N2
      N6 = N6 + 1
  200 CONTINUE
      CALL WRBLK(2,NV,SPEECH,MBLOCK,IENDS)
      NV = NV + 1
      IF(NV .LT. ISIZE)GO TO 100

  500 TYPE "THE FOLLOWING NO. OF POINTS WERE CHECKED ",N6
      TYPE "AND THE MAX. VALUE FOUND WAS ",N5

C***  SECOND PASS: SCALE THE OUTPUT FILE SO THE MAX VALUE IS 1900
      S2 = 1900.0 / FLOAT(N5) * FLIP
      N6 = 0
      NV = 0
  600 CALL RDBLK(2,NV,SPEECH,MBLOCK,IENDS)
      DO 700 J = 1,256
      S1(J) = FLOAT(SPEECH(J)) * S2
      SPEECH(J) = INT(S1(J))
  700 CONTINUE
      CALL WRBLK(2,NV,SPEECH,MBLOCK,IER)
      N6 = N6 + 1
      NV = NV + 1
      IF(NV .LT. ISIZE)GO TO 600
  900 CONTINUE
      N15 = N6 * 256
      TYPE "THE FOLLOWING NO. OF POINTS WERE OUTPUT ",N15
      CALL CLOSE(1,IER)
      IF(IER .NE. 1)TYPE "CLOSE ERROR ON INPUT ",IER
      CALL CLOSE(2,IER)
      IF(IER .NE. 1)TYPE "CLOSE ERROR ON OUTPUT ",IER
      TYPE "BLOCKS PROCESSED: ",N6
      GO TO 60
   50 TYPE "OPEN ERROR ON OUTPUT"
      GO TO 60
   40 TYPE "OPEN ERROR ON INPUT"
   60 STOP
      END
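The scaling step in SCALE takes two passes: find the largest magnitude, then multiply every sample by 1900 divided by that magnitude and truncate to an integer. The sketch below shows the same arithmetic in Python on an in-memory list. The function name `scale_to_peak` is hypothetical, block file I/O and noise addition are left out, and the guard against an all-zero input is an addition that the listing does not have.

```python
def scale_to_peak(samples, peak=1900.0):
    """Scale integer samples so the largest magnitude becomes `peak`."""
    largest = max(abs(s) for s in samples)      # first pass: find the maximum
    if largest == 0:
        return list(samples)                    # assumed guard: nothing to scale
    factor = peak / float(largest)
    # Second pass: int() truncates toward zero, as INT does in the listing.
    return [int(float(s) * factor) for s in samples]


scaled = scale_to_peak([100, -50, 25])
```

Here the largest magnitude is 100, so every sample is multiplied by 19.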
C***********************************************************************
C
C     LOAD LINE:  RLDR TSTRND DRAND UNVOCD PLOT10 PLOTS.LB
C                 GRPH.LB @FLIB@
C
C     THIS PROGRAM RUNS EITHER THE UNIFORM OR NORMAL GENERATOR
C     AND PROVIDES A PLOT (PRINTRONIX OR TEKTRONIX) AND/OR THE
C     MEAN AND VARIANCE.
C
C***********************************************************************


      DIMENSION IT(500),U(256),T(500),XHOR(128),YVER(128),W(256)
      DOUBLE PRECISION INTEGER IX
      INTEGER FRMSIZ,NAME1(7),NAME2(7)

      ACCEPT "HOW MANY 256 POINT SETS? ",NUM
      NUFRM = NUM * 256
      IX = DBLE(203)
      DO 50 I = 1,256
      U(I) = DBLE(0.0)
      W(I) = DBLE(0.0)
   50 CONTINUE
      DO 100 I = 1,500
      IT(I) = 0
  100 CONTINUE
      ICOUNT = 0
      SUM1 = 0.0
      SUM2 = 0.0
      K = 0
      ACCEPT "CHOOSE RANDOM GENERATOR(1-NORMAL,0-UNIFORM) ",NORM
      IF(NORM .EQ. 1)GO TO 1200

C***  UNIFORM GENERATOR
      DO 1000 NTIM=1,NUM
      DO 900 MTIM = 1,128
      ICOUNT = 2 + ICOUNT
      PEMP = SNGL(DRAND(IX))
      TEMP = PEMP * 500.
      SUM1 = SUM1 + TEMP
      SUM2 = SUM2 + (TEMP)**2
      ITEMP = INT(TEMP)
      IF((ITEMP .GT. 500).OR.(ITEMP .LT. 0))GO TO 600
      XHOR(MTIM) = TEMP
      IT(ITEMP) = IT(ITEMP) + 1
      GO TO 800
  600 TYPE "DATA EXCEEDS BOUNDARY AT ",ITEMP
  800 PEMP = SNGL(DRAND(IX))
      TEMP = PEMP * 500.
      SUM1 = SUM1 + TEMP
      SUM2 = SUM2 + (TEMP)**2
      ITEMP = INT(TEMP)
      IF((ITEMP.GT. 500).OR.(ITEMP.LT. 0)) GO TO 850
      YVER(MTIM) = TEMP
      IT(ITEMP) = IT(ITEMP) + 1
      GO TO 900
  850 TYPE " DATA EXCEEDS BOUNDARY AT ",ITEMP
  900 CONTINUE
      IF ((IER.NE. 1).OR.(JER.NE. 1)) TYPE " WRBLK ERROR ",IER,JER
      K = K + 1
 1000 CONTINUE
      TYPE "PRODUCED UNIFORM DISTRIBUTION "
      GO TO 5000

C***  NORMAL GENERATOR
 1200 FRMSIZ = 256
      DO 3000 NTIM=1,NUM
      CALL UNVOCD(U,FRMSIZ,IX)
      CALL UNVOCD(W,FRMSIZ,IX)
      DO 2500 NR=1,256
      ITEMP = INT(U(NR) * 80.0)
      ITEMP = ITEMP + 250           ;CENTERING FOR PLOTS
      IF((ITEMP .GT. 500).OR.(ITEMP .LE. 0))GO TO 1400
      SUM1 = SUM1 + FLOAT(ITEMP)
      SUM2 = SUM2 + FLOAT(ITEMP)**2
      XHOR(NR) = ITEMP
      IT(ITEMP) = IT(ITEMP) + 1
      ICOUNT = ICOUNT + 1
      GO TO 1600
 1400 CONTINUE
 1600 ITEMP = INT(U(NR) * 80.0)
      ITEMP = ITEMP + 250           ;CENTERING FOR PLOTS
      IF((ITEMP .GT. 500).OR.(ITEMP.LT. 0)) GO TO 2000
      SUM1 = SUM1 + FLOAT(ITEMP)
      SUM2 = SUM2 + FLOAT(ITEMP)**2
      IT(ITEMP) = IT(ITEMP) + 1
      ICOUNT = ICOUNT + 1
      GO TO 2500
 2000 CONTINUE
 2500 CONTINUE
      K = K + 1
 3000 CONTINUE
      TYPE "PRODUCED NORMAL DISTRIBUTION "

 5000 CONTINUE
      DO 4000 K = 1,500
      T(K) = FLOAT(IT(K))
 4000 CONTINUE
      XMEAN = SUM1/ICOUNT
      XMEAN2 = XMEAN**2
      VAR = SUM2/ICOUNT - XMEAN2
      TYPE "VAR = ",VAR
      STDEV = SQRT(VAR)

C*******PLOTS******
      ACCEPT "DO YOU WANT A PLOT?(1-Y,0-N) ",NYES
      IF(NYES.NE. 1) GO TO 5600
      ACCEPT "USE PRINTRONICS PLOTTER?(1-Y,0-N) ",NO
      IF (NO.EQ. 0) GO TO 5500
      NP = 1
      SF = 1.0
      NPTS = 500
      CALL PLOT10(T,NPTS,NP,XO,YO,SF)
      NP = 10
      CALL PLOT10(T,NPTS,NP,XO,YO,SF)
      GO TO 5600
 5500 IFSCL = 0
      MODE = 0
      NO = 1
      N = 500
      CALL GRPH2("DENSITY",NO,T,U,N,MODE,YM,YA,IFSCL)
 5600 CONTINUE
      TYPE " MEAN = ",XMEAN," STDEV = ",STDEV
      STOP
      END
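The statistics in TSTRND come from two running sums: the mean is the sum of the values divided by the count, and the variance is the mean of the squares minus the square of the mean. A small Python sketch of that bookkeeping, together with the histogram that feeds the density plot, is given below. The function names `running_stats` and `histogram` are hypothetical, and the histogram silently drops out-of-range values where the listing types a warning.

```python
def running_stats(values):
    """Return (mean, variance) using sum and sum-of-squares accumulators."""
    count = 0
    total = 0.0
    total_sq = 0.0
    for v in values:
        count += 1
        total += v
        total_sq += v * v
    mean = total / count
    variance = total_sq / count - mean ** 2
    return mean, variance


def histogram(values, bins=500):
    """Count how many values truncate to each integer bin 0..bins-1."""
    counts = [0] * bins
    for v in values:
        index = int(v)
        if 0 <= index < bins:       # out-of-range values are skipped
            counts[index] += 1
    return counts


mean, variance = running_stats([1.0, 2.0, 3.0, 4.0])
counts = histogram([0.5, 1.2, 1.9, 7.0], bins=4)
```

The variance here is the population variance (division by the count), which matches the formula in the listing.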
APPENDIX D


Code for Subroutine LATTICE

FILENAME:  LATTICE.FR     DATE:  12: 3:83     TIME:  14:47:38

C***********************************************************************
C
C     THIS SUBROUTINE CALCULATES THE PREDICTOR COEFFICIENTS
C     BY THE LATTICE METHOD AS PRESENTED ON PP 411-416 OF
C     RABINER & SCHAFER.
C
C***********************************************************************


      SUBROUTINE LATTICE(N,X,POLES,A,ALPHA,K)
      DIMENSION X(1),A(1)
      DOUBLE PRECISION B(0:400),E(400),DA(20),DK(20),DAL,R0
      DOUBLE PRECISION TEMP1,TEMP2,SUM1,SUM2,EM,D2
      REAL K(1)
      INTEGER POLES

      CALL OVERFL(IFLO2)
      IF(IFLO2.EQ. 1) TYPE " OVERFLOW IN PREDICT "
      IF(IFLO2.EQ. 3) TYPE " UNDERFLOW IN PREDICT "
      DO 10 I = 1,POLES
      DA(I) = DBLE(0.0)
      DK(I) = DBLE(0.0)
   10 CONTINUE
      KNE = 1
      D2 = DBLE(2.0)
      DAL = DBLE(0.0)
      B(0) = DBLE(0.0)
      DO 20 I = 1,N
      DAL = DAL + DBLE(X(I)*X(I))
   20 CONTINUE
X     DAL = 1D0
      DAL = DAL/1D04
X     R0 = DAL
      DO 30 M = 1,N
      E(M) = DBLE(X(M))
      B(M) = DBLE(X(M))
   30 CONTINUE

C***  FIRST REFLECTION COEFFICIENT
      SUM1 = DBLE(0.0)
      SUM2 = DBLE(0.0)
      DO 40 M = 1,N
      M1 = M - 1
      TEMP1 = E(M)*B(M1)
      TEMP2 = (E(M)*E(M)) + (B(M1)*B(M1))
      SUM1 = SUM1 + TEMP1
      SUM2 = SUM2 + TEMP2
   40 CONTINUE
      DK(1) = D2*SUM1/SUM2
      IF(DABS(DK(1)).GT. DBLE(1.))TYPE " ERROR "
X     TYPE " DK(",KNE,") = ",DK(1)
      DA(1) = DK(1)

C***  REMAINING STAGES
      DO 200 I = 2,POLES
      I1 = I - 1
      DO 50 M = 1,N
      M1 = M - 1
      EM = E(M)
      E(M) = EM - DK(I1)*B(M1)
      B(M) = B(M1) - DK(I1)*EM
      IF(DABS(E(M)).LE. 1D-15) E(M) = DBLE(0.0)
      IF(DABS(B(M)).LE. 1D-13) B(M) = DBLE(0.0)
   50 CONTINUE
      SUM1 = DBLE(0.0)
      SUM2 = DBLE(0.0)
      DO 60 M = 1,N
      M1 = M - 1
      TEMP1 = E(M)*B(M1)
      TEMP2 = (E(M)*E(M)) + (B(M1)*B(M1))
      SUM1 = SUM1 + TEMP1
      SUM2 = SUM2 + TEMP2
   60 CONTINUE
      CALL OVERFL(IFLO2)
      IF(IFLO2.EQ. 1) TYPE " OVERFLOW IN SUM "
      IF(IFLO2.EQ. 3) TYPE " UNDERFLOW IN SUM "
      DK(I) = D2*SUM1/SUM2
      IF(DABS(DK(I)).GT. DBLE(1.))TYPE " ERROR "
      TYPE " DK(",I,") = ",DK(I)
      DA(I) = DK(I)
      DO 80 J = 1,I1
      DA(J) = DA(J) - DK(I)*DA(I-J)
   80 CONTINUE
      DAL = DAL - DK(I)*DK(I)*DAL
  200 CONTINUE

      CALL OVERFL(IFLO)
      IF(IFLO.EQ. 1) TYPE " OVERFLOW IN LATTICE "
      IF(IFLO.EQ. 3) TYPE " UNDERFLOW IN LATTICE "
      DO 250 M = 1,N
      M1 = M - 1
      E(M) = E(M) - DK(I1)*B(M1)
      IF(DABS(E(M)).LE. 1D-20) E(M) = DBLE(0.0)
  250 CONTINUE
      DO 300 M = 1,N
      DAL = DAL + E(M)*E(M)
  300 CONTINUE
      ALPHA = SNGL(SQRT(R0*DAL))
      ALPHA = SNGL(SQRT(DAL))
      TYPE " ALPHA = ",ALPHA
      DO 100 I = 1,19
      IMJ = 21 - I
      A(IMJ) = -SNGL(DA(IMJ-1))
      K(I) = SNGL(DK(I))
  100 CONTINUE
      A(1) = 1.0
      K(20) = SNGL(DK(20))
      CALL OVERFL(IFLO1)
      IF(IFLO1.EQ. 1) TYPE " OVERFLOW IN ALPHA "
      IF(IFLO1.EQ. 3) TYPE " UNDERFLOW IN ALPHA "
      ACCEPT "CONTINUE ON A NUMBER",IUKL
      RETURN
      END
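The lattice recursion used in LATTICE can be summarized in a few lines. The Python sketch below follows the usual textbook (Burg) form of the method, not the listing line by line: the function name `lattice_lpc` is hypothetical, the gain computation and overflow checks are omitted, and each stage builds new error sequences from copies of the previous stage instead of updating arrays in place. The backward error is taken as zero before the start of the frame, as B(0) is in the listing.

```python
def lattice_lpc(x, order):
    """Return (a, ks): predictor coefficients a[0..order-1] (for lags 1..order)
    and the reflection coefficients ks, computed stage by stage."""
    e = [float(v) for v in x]       # forward prediction error
    b = [float(v) for v in x]       # backward prediction error
    a = []
    ks = []
    for i in range(order):
        b_delayed = [0.0] + b[:-1]  # b(m-1), zero before the frame starts
        num = sum(em * bm for em, bm in zip(e, b_delayed))
        den = sum(em * em + bm * bm for em, bm in zip(e, b_delayed))
        k = 2.0 * num / den if den > 0.0 else 0.0
        ks.append(k)
        # Step-up recursion: a_j <- a_j - k * a_(i-j), then append a_i = k.
        prev = a[:]
        a = [prev[j] - k * prev[i - 1 - j] for j in range(i)] + [k]
        # Both new sequences are built from the old e and b_delayed.
        e, b = ([em - k * bm for em, bm in zip(e, b_delayed)],
                [bm - k * em for em, bm in zip(e, b_delayed)])
    return a, ks


coeffs, reflections = lattice_lpc([0.5 ** n for n in range(50)], 2)
```

For the decaying test sequence, each sample is half the previous one, so the first predictor coefficient should come out close to 0.5 and the second close to zero. Every reflection coefficient from this formula has magnitude no greater than one, which is the condition the listing tests before typing " ERROR ".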